ABSTRACT



Each programming language that handles data structures 
has its own set of rules for working with them. Notions 
such as assignment and construction of structured values 
appear in a huge number of different and complicated
versions. This thesis presents a methodology which provides a
common basis for describing ways in which programming lan- 
guages deal with data structures and references to them. 
Specific concern is paid to issues of sharing. 

The methodology presented here consists of two parts.
The base language model, a formal semantic model introduced
by Dennis, is used to give the work here a precise
foundation. A series of "mini-languages" are defined to make
it simpler and more convenient to express and describe the
semantics for a variety of constructs found in contemporary
programming languages.
Chapter 1
INTRODUCTION 

1.1. General Goals 

Students of computer science are confronted at a very 
early stage with a great variety of general-purpose pro- 
gramming languages. Descriptions of these languages place 
heavy emphasis on common features such as assignment, pro- 
cedures, conditionals, input/output and block structure. 
Aside from variations in notation, there are numerous rules,
exceptions and special cases which make for differences
between comparable constructs in different languages. For
example, the body of a DO-loop in FORTRAN must be executed at
least once, while in PL/1 it is to be skipped if the index
is out of range (figure 1.1-1). Such differences can be
studied by examining the semantics of different programming
languages. The semantics of a programming language is the
study of the meaning of its constructs, or in other words
the effect of executing programs in the language. The
particular concern of this thesis is the notion of data
structures and the semantics pertaining to them as they
appear in programming languages.

        FORTRAN                     PL/1

        N = 1                       N = 1;
        DO 50 I = 2,N               DO I = 2 TO N;
           [body]                      [body]
     50 CONTINUE                    END;

     body executed once          body not executed

     Fig. 1.1-1. Looping feature in two languages
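
The difference shown in figure 1.1-1 can be imitated in a
present-day language. The Python sketch below is not taken from
either language definition; the function names are
hypothetical, and the FORTRAN behavior is simply assumed to be
the at-least-once rule described above.

```python
def fortran_style_do(start, stop):
    """Post-test loop: the body runs once before the index is checked."""
    visited = []
    i = start
    while True:
        visited.append(i)   # [body]
        i += 1
        if i > stop:
            break
    return visited


def pl1_style_do(start, stop):
    """Pre-test loop: the index is checked before the body ever runs."""
    visited = []
    i = start
    while i <= stop:
        visited.append(i)   # [body]
        i += 1
    return visited


n = 1
print(fortran_style_do(2, n))   # body executed once, with I = 2
print(pl1_style_do(2, n))       # body not executed
```

When the index range is not empty, the two functions visit the
same index values; they differ only in the out-of-range case.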

There are many areas of application in which the use of
structured data is both helpful and convenient in problem
solving. Some example areas are symbol manipulation,
artificial intelligence, computer graphics, and simulation
studies. Generally speaking, a data structure is an aggregate
data object containing other data objects as components.
Typical instances of data structures include arrays,
sequences, vectors, tuples and lists. We will not dwell on the
characteristics peculiar to each of these different
varieties of data structure; our emphasis will be on more
general properties relating to data structures and their
components.



Typically, a programming language provides two basic
operations for handling data structures: component objects
of a data structure can be individually accessed and
manipulated, and data structures can be constructed from
designated objects as components. These operations interact
with the assignment operation of a programming language in
performing several other tasks, such as assigning structured
values to identifiers, or updating components of a
structure. There is a great similarity in appearance among
constructs for performing such tasks in various programming
languages. On the surface, from a casual examination of
language descriptions, distinctions between analogous
constructs in different languages appear to be mostly
notational. But we shall see important semantic distinctions,
particularly in the area of data being shared between
different structures.
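
As a small illustration of why sharing matters, consider two
ways of constructing a structure from an existing object. The
Python fragment below is only an analogy chosen for this
purpose (it is not one of the languages studied in the thesis,
and the variable names are hypothetical): one structure shares
its component with the original object, the other holds a copy.

```python
import copy

point = {"x": 1, "y": 2}
shared = {"origin": point}                  # component is the same object
copied = {"origin": copy.deepcopy(point)}   # component is an independent copy

point["x"] = 99                             # update through the original name

print(shared["origin"]["x"])   # 99: the update is visible through the sharing
print(copied["origin"]["x"])   # 1: the copy is unaffected
```

The two constructions look alike on the surface, yet an update
to a component has different effects in each case.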

Since each programming language has its own set of 
rules for dealing with data structures and sharing, it is 
desirable to seek a rigorous method for describing what 
happens. Our goal, then, is to gain a more precise under- 
standing of the semantics of data structures. This will 
provide a unified and coherent viewpoint for describing the 
different approaches to data structures as they are found in
programming languages. We will pay specific attention to
the difficult and important issue of properties of sharing.
These issues depend ultimately on the concepts of cells
(which model computer memory locations) and references to
cells. References are also commonly known as pointers. We
will first discuss general questions of programming language 
semantics, and then move towards a more specific treatment 
of data structures and references. 

1.2. Background on Formal Semantics

A programming language provides a notation in which the
programmer can model computational processes and the
information on which they operate. Programming language
semantics deals with the relationship between programs and the
objects they represent. A formal semantics for a programming
language is a precise description of such a relationship.
There has been much study of formal semantics of programming
languages. Wegner [Weg 72a] distinguishes three classes of
formal semantic models:

(1) Abstract semantic models. In this approach, the
objects being modeled are treated as mathematical entities
independent of any particular representation. Models of
this class aim towards providing a formal mathematical
description of the computational notions being studied. One
well-known example of this approach to semantics has been
the use of the lambda calculus as a semantic model for
programming languages. The lambda calculus, which is described
in [Der 74, Morr 68, Weg 68], is basically a mathematical
formalism for the definition and application of functions.
It is ideally suited for describing so-called applicative
features of programming languages, such as evaluation of
expressions, use of procedures, and block structuring. Landin
demonstrated its usefulness in these areas [Lan 64] and
presented a scheme for extending the lambda calculus formalism
to model the language ALGOL 60 [Lan 65]. More recently,
different extensions of the lambda calculus have been
devised for describing data types [Reyn 73].

A second major example of the abstract approach to
semantics is found in the work of Scott [Scot 70, Scot 71].
Scott makes use of the mathematical theory of lattices
[San 73] to construct sets which are the domains of
functions that represent the behavior of programs. The Scott
formalism has been used recently to describe the semantics
of ALGOL 60 [Mos 74]. We can briefly summarize abstract
semantic models by saying that they characterize the action
of programs as functions over various domains.

(2) Input-output models. Models of this class use
statements of mathematical logic as assertions about the
state of a computer system at various points during the
execution of programs on it. The semantics of a program is
viewed as the relation between input assertions (the state
of the system before execution) and output assertions (the
state after the program is run). This approach to semantics,
more frequently called the axiomatic approach, was developed
by Floyd [Floy 67] and Hoare [Hoar 69, Hoar ??]. There has
been much further work on it. Axiomatic semantics is most
useful in proving correctness of programs, i.e. establishing
that the effect of executing a program fulfills mathematical
conditions the program is supposed to satisfy.

(3) Operational models. This approach to semantics
concerns itself specifically with modeling the changing
states of a computer system performing computations. Such a
task is usually accomplished by means of a state-transition
system, in which a state of the model represents the
information in the computer system at a given time. The effect
of a program on its input data is reflected in the sequence
of transitions of the model. It is important to observe
that given a state-transition system corresponding to some
program, the sequence of states that models the execution of
this program defines the action of an interpreter for the
program. For this reason, the approach to formal semantics
using operational models is called interpretive semantics.

We can describe the way in which an interpretive seman- 
tic model gives the semantics for a program written in some 
source language. A translator transforms the program into 
an equivalent program in another language which we call an 
abstract language. Programs in an abstract language are
acted upon by an interpreter; this action results in a 
sequence of state transitions of the model. The semantics 
of the original source-language program is given by such a 
sequence of transitions. One reason we make use of trans- 
lators is that source programs are usually represented as 
character strings rather than as data objects suitable for 
processing by the interpreter. 
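
The division of labor between translator and interpreter can
be sketched for a toy source language. Everything in the
Python fragment below is an assumption made for illustration:
the source language (assignments of integers or identifiers,
separated by semicolons), the tuple form of the abstract
program, and the function names. It is not the model used in
this thesis.

```python
def translate(source):
    """Translator: character string -> abstract program (a list of tuples)."""
    program = []
    for statement in source.split(";"):
        if statement.strip():
            target, expr = (part.strip() for part in statement.split("="))
            program.append(("assign", target, expr))
    return program


def interpret(program):
    """Interpreter: yield the sequence of states the program passes through."""
    state = {}
    yield dict(state)                       # initial state
    for _op, target, expr in program:
        value = int(expr) if expr.lstrip("-").isdigit() else state[expr]
        state[target] = value
        yield dict(state)                   # state after this transition


states = list(interpret(translate("a = 1; b = a; a = 7")))
print(states)
```

The list of states, rather than any single result, is what
gives the meaning of the source program in this style.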

Although the use of interpreters to implement
programming languages was (and still is) commonplace, McCarthy
[McC 62] was the first to use an interpreter to define a
language (LISP). The semantics of LISP is given formally by
an interpreter written in LISP. Landin [Lan 64, Lan 66b]
uses an interpreter called the SECD machine to define the
lambda calculus, even though the lambda calculus is a
mathematical formalism with a rigorous definition of its own. A
more recent discussion of definitional interpreters is found
in [Reyn 72].

Of these three approaches to formal semantics of
programming languages, the interpretive approach is best suited
for our goals of understanding the semantics of data
structures and references. In order to properly explain the
semantics of a program that handles data structures, we will
need to know how the data structures are formed, their
composition, the relationships between the structures and their
components, sharing properties, and other items of
information. The best way to get a handle on this kind of
information is to consider the state of the system at various
moments during the execution of the program. The
interpretive approach is the only one which lends itself directly to
working with states of the system. Both of the other
approaches are better suited for proving assertions about
programs and establishing their correctness; but these
issues are outside our main concern here. A treatment of
data structures from the viewpoint of axiomatic semantics
may be found in [Lav 74]. We will work towards developing
an interpretive model to be used as a semantic foundation
for dealing with the important issues of data structures and
references.

The most prominent interpretive model for semantics is
the VDL model. VDL, the Vienna Definition Language, is a
metalanguage for writing interpreters of programming
languages. VDL interpreters have been written for languages
such as ALGOL 60 [Lau 68], PL/1 [Walk 69, Luc 69], BASIC, and
PDP-8 machine language [Lee 72]. An elementary introduction
to VDL may be found in [Weg 72b]. Just as LISP works with
lists, VDL works with tree-like data objects (which we call
labeled trees). The basic operation of the VDL model is as
follows: for each source language whose semantics we wish
to describe, we define a translator and an interpreter. The
translator transforms a source language program into an
abstract program, which is a form of labeled tree suitable for
manipulation by the interpreter (for each source language
the corresponding abstract language will be some set of
labeled trees; the structure of an abstract program varies
from language to language). The interpreter, which consists
of VDL code, accepts a labeled tree as input and interprets
the effect of the program on its input data. For different
languages, different interpreters are defined.

The fact that VDL uses treelike data objects reduces
its desirability as a semantic model for our work on data
structures. We will be studying data structures in which
components may be shared between different objects; VDL's
labeled trees do not directly admit sharing of any kind.
Thus in order to model in VDL structures such as we will
study, it would be necessary to go through the inconvenience
of simulating the memory of a computer. Since the study of
sharing is fundamental to our work, it is desirable to work
with objects in which sharing is represented directly. We
therefore prefer for our goals a semantic model that
manipulates data objects of a more general nature than VDL's
labeled trees.

In [Denn 71], Dennis outlines an interpretive semantic
model called the base language model. The data objects
manipulated by this model are variants of directed graphs and
can directly model sharing. As with VDL, for each language
whose semantics we wish to describe, we must specify a
translator which transforms programs in the language into
data objects suitable for consumption by the model. These
objects are called procedure structures in the base language
model. Procedure structures, like VDL's abstract programs,
are acted upon by the interpreter to produce state
transitions. But the base language model differs from VDL in
that the composition of a procedure structure generated by
the translator from some source program does not depend on
the language in which the program was written. As a result,
there is no need to define a separate interpreter for each
programming language. There is a single, presupplied
interpreter for the base language model which accepts
arbitrary procedure structures and interprets them as programs.
Thus we see that the translators for the base language model
translate programs from their respective source languages
into a single, common language. We call this language the
base language. A procedure structure represents a program
in the base language, which consists of a sequence of
instructions. The individual base language instructions
specify the fundamental state transitions of the model.

In order to achieve the language-independence of the
interpreter in the base language model, the translators must
do more work than their VDL counterparts. A VDL translator
simply converts a program from character string to labeled
tree, while a translator for the base language model must
perform functions similar to those of a compiler. Thus,
once we specify the semantics of the base language, i.e.
decide on a formal specification of the actions performed by
the interpreter in the base language model, the semantics of
a particular programming language is determined by its
translation into the base language.

The base language model is extremely well suited for
our work. The primitive instructions of the base language
are particularly convenient for manipulating structured
objects and dealing with sharing. We can view the base
language as the machine language for a computer with
heap-structured memory and symbolic address space. In this
respect, programs in the base language will be similar to
conventional assembly language programs. This similarity is a
source of further convenience in using the base language as
a programming tool.

Amerasinghe [Amer 72] described the translation of a
block-structured language BLKSTRUC into the base language.
In BLKSTRUC, procedures are "first-class objects" [Stra 67]
which can be used in contexts as general as objects of other
types. BLKSTRUC's treatment of procedures is more general
than ALGOL 60's. The action of a translator for a language
with non-local goto's is described in [Amer 73]. Translators
for the languages SNOBOL4 and Simula 67 are discussed
in [Dra 73] and [Cou 73]. These works show the use of the
base language model in describing the semantics of various
powerful programming languages. We will be using a version
of the base language model as the semantic foundation for
our study of data structures.

1.3. Plan for the Thesis 

We outline here the topics covered in the rest of this
thesis. Chapter 2 describes the base language model as we
will be using it. The action of the interpreter is given by
describing the effect of the instructions of the base
language. The approach in Chapter 2 is informal; a more
rigorous treatment is found in the Appendix. Once the
behavior of the base language interpreter is known, we have a
handle on the semantics of the programming-language
constructs that interest us. All that will then need to be
done to supply a formal semantic definition is simply to
describe the action of a translator which produces base
language code.

In the remainder of this thesis we will be using the
base language model as a semantic foundation for describing
the different ways various programming languages deal with
data structures. We want to make clear distinctions between
comparable constructs in different languages. Although the
semantics of data structuring constructs can be precisely
expressed by using the base language model, there is a
certain respect in which the model is less than ideal as a
descriptive vehicle. Data structures as they are found in
programming languages are tied up with the notions of
variables and values. We would like to make use of these
notions in talking about the semantics of data structures.
But the descriptive level of the base language is only
equipped for talking about primitive transformations on the
objects which comprise the interpreter states. In this
sense the base language is too "low-level" for describing
data structures in a manner suitable for our purposes.

To provide a better descriptive mechanism, we will
follow the approach taken by Ledgard [Led 71] in defining a
series of "mini-languages." Mini-languages provide
descriptive levels appropriate to our needs, yet at the same
time avoid the syntactic and semantic complexity of
full-scale programming languages. The primary advantage of the
mini-language approach is that we can isolate the concepts
we wish to describe by eliminating all the conceptually
extraneous notions that are needed in a full-size language.
Accordingly, in a mini-language for describing data
structures, there are no procedures, conditional expressions,
loops, goto's or operators. Mini-languages are not meant to
be viable languages for actual programming; they are used
for descriptive purposes only. The syntax and semantics of
a mini-language are simple enough to be readily understood
on an informal basis; the semantics can then be formalized
by specifying translation into the base language. In this
manner, the semantics of data-structuring constructs in
full-scale programming languages can be given by describing how
to express these notions in a suitable mini-language.

Chapter 3 presents mini-languages for describing the
notions related to assignment, data structures, pointers and
sharing. These mini-languages are then used to describe the
data structuring semantics of several full-scale programming
languages.
In Chapter 4, we treat the additional notion of static
typechecking, which has a direct bearing on the semantics of
data structures in many important programming languages.
This notion of static typechecking differs from Ledgard's in
that it deals with structured types, where Ledgard [Led 71]
deals with functional types and the types of arguments and
returned values. As in Chapter 3, we treat the data
structuring facilities of three full-size languages; in
these languages the concept of static typechecking is
directly tied in with the semantics of data structures
(specifically assignment).

Chapter 5 presents a summary of what we cover in this 
thesis and suggests extensions for further study. 
Chapter 2
THE BASE LANGUAGE MODEL 

2.1. Overview of the Model 

We have chosen as the semantic foundation for our work
a version of the base language model set forward in [Denn 71]
and [Amer 72]. The base language model centers around a
base language interpreter, which is essentially a
state-transition system that we shall use to express the meaning
of computations. The interpreter specifies the behavior of
an entire computer system. We represent a computation by a
sequence of interpreter states. A state of the interpreter
will be a certain kind of mathematical object embodying the
information contained in the computer system at a
particular point in time. We shall define a base language called BL,
each of whose programs consists of a sequence of instructions.
Each instruction specifies a functional transformation
between interpreter states. The language BL is adapted from
the rudimentary language described by Dennis in [Denn 71].

We represent interpreter states by mathematical
objects known as BL-graphs. Suppose we are given a set ELEM
of elementary objects and a set SEL of selectors. (For our
purposes, ELEM consists of integers, real numbers and
strings; SEL consists of integers and strings.) Then a
BL-graph is a variant form of directed graph; it consists of
nodes and arcs. Each arc connects two nodes in a specified
direction and is labeled with a selector. We may associate an
elementary object with each node from which no arcs lead
out. There must also be a distinguished subset of the
nodes (called the root nodes) from which each node of
the graph can be reached along some directed path of arcs.
We give a formal mathematical definition of BL-graphs in the
Appendix.
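
For readers who prefer a concrete representation, the informal
definition above can be sketched as a small Python data
structure. This is only one possible encoding and not the
formal definition of the Appendix; the class name Node, the
use of None for "no elementary object", and the helper names
are all assumptions of the sketch.

```python
class Node:
    """A node of a BL-graph: labeled outgoing arcs plus an optional value."""
    def __init__(self, value=None):
        self.arcs = {}        # selector (int or str) -> Node
        self.value = value    # elementary object; allowed on leaf nodes only


def reachable(roots):
    """Collect every node reachable from the root nodes along directed arcs."""
    seen, stack = {}, list(roots)
    while stack:
        node = stack.pop()
        if id(node) not in seen:
            seen[id(node)] = node
            stack.extend(node.arcs.values())
    return seen


def is_bl_graph(nodes, roots):
    """Check the two conditions stated in the text."""
    values_only_on_leaves = all(n.value is None or not n.arcs for n in nodes)
    reach = reachable(roots)
    every_node_reachable = all(id(n) in reach for n in nodes)
    return values_only_on_leaves and every_node_reachable


root, k, hi = Node(), Node(), Node("hi")
root.arcs["k"] = k
k.arcs["u"] = hi
print(is_bl_graph([root, k, hi], [root]))            # True
print(is_bl_graph([root, k, hi, Node(5)], [root]))   # False: stray node
```

Using a dictionary for the arcs reflects the fact that
components are named by selectors, so a node has at most one
outgoing arc per selector in this encoding.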

A BL-graph with a single root node is called a BL-object.
We identify a BL-object by its root node. Specifically,
for any node α in a BL-graph G, we associate with α the
subgraph of G whose nodes and arcs are accessible from α. This
subgraph is a BL-graph with α as its root node; we call it
the object of α.

If there is a directed path from one node of a BL-graph
to another node, then the second node is called a descendant
of the first node. All nodes in a BL-graph are descendants
of some root node. A node from which no arcs emerge is
called a leaf node. An elementary object attached to a leaf
node is called the value of that node. If there is an arc
from a node α to another node β, then β is called a
component of α, and the object of β is called a component of
the object of α. Components are named by the selectors on
the arcs leading into them. If an object is a component of
two distinct objects, it is said to be shared between them.
Nodes in a BL-object are denoted by pathnames. A pathname
for a node is a sequence of selectors labeling a directed
path to that node from the root node. If the object of a
node is shared, then the node will have distinct pathnames.
The property of sharing is of major significance; we will
have much to say about it.
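
The notions of descendant and sharing translate directly into
code. The Python sketch below assumes a hypothetical Node
class with a dictionary of arcs; the sample object is only
loosely modeled on the kind of structure discussed in this
chapter, and the value 9 is a placeholder.

```python
class Node:
    """A BL-graph node: labeled outgoing arcs and an optional leaf value."""
    def __init__(self, value=None):
        self.arcs = {}        # selector -> component node
        self.value = value


def descendants(node):
    """Every node reachable from `node` along one or more arcs."""
    found, stack = {}, list(node.arcs.values())
    while stack:
        current = stack.pop()
        if id(current) not in found:
            found[id(current)] = current
            stack.extend(current.arcs.values())
    return list(found.values())


def is_shared(target, root):
    """True if `target` is a component of two distinct nodes under `root`."""
    nodes = {id(n): n for n in [root] + descendants(root)}
    parents = [n for n in nodes.values()
               if any(child is target for child in n.arcs.values())]
    return len(parents) >= 2


root, k, a = Node(), Node(), Node()
nine, hi = Node(9), Node("hi")
root.arcs.update({"k": k, "a": a})
k.arcs.update({"c": nine, "u": hi})
a.arcs[6] = hi                      # second arc into the same leaf

print(is_shared(hi, root))          # True: component of both k and a
print(is_shared(nine, root))        # False: only k leads to it
```

Sharing is decided by node identity, not by equality of
values: two distinct leaves holding the same value are not
shared.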

We will be making heavy use of pictorial
representations of BL-objects. An elementary object is drawn as an
encircled value (figure 2.1-1). For a general BL-object, the
nodes are drawn as heavy dots. The root node is at the top.
Arcs emerging from a node are drawn downwards from a
horizontal line attached to the node. Selectors are written
across the arcs that they label. If a selector is a string,
we do not enclose it in quotes. Elementary objects attached
to root nodes hang downwards from them. Thus our pictorial
conventions for BL-objects differ slightly from those used
in [Denn 71].

[Fig. 2.1-1. Sample elementary objects]

Sample BL-objects are pictured in figures 2.1-2 and
2.1-3. The object in figure 2.1-2 has three components,
named k, o and a. The o-component is empty. The k-component
has two components, both of which are leaf nodes. The leaf
node with value [illegible] has pathname k.c. The leaf node
with value 'hi' is shared between nodes k and a and has
pathnames k.u and a.6. In figure 2.1-3, the object with
value 1.6 is shared between the objects s.b and s and has
pathnames s.b.5 and s.4. The object of node c is shared
between the object of the root node and the object c.y.
Since the node c is a descendant of itself, it has
infinitely many pathnames c, c.y.2, c.y.2.y.2, c.y.2.y.2.y.2, and
so on. The path joining this node to itself is a directed
cycle.

[Fig. 2.1-3. A sample BL-object]
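
Because a node on a directed cycle has infinitely many
pathnames, any enumeration must be cut off somewhere. The
Python sketch below lists the pathnames of a node up to a
given length; the Node class, the function name and the
length bound are assumptions of the sketch, and the sample
graph reproduces only the cyclic part of the structure just
described.

```python
class Node:
    """A BL-graph node: labeled outgoing arcs and an optional leaf value."""
    def __init__(self, value=None):
        self.arcs = {}
        self.value = value


def pathnames(root, target, max_length):
    """Selector sequences of length <= max_length leading from root to target."""
    results = []
    frontier = [((), root)]
    for _ in range(max_length):
        next_frontier = []
        for path, node in frontier:
            for selector, child in node.arcs.items():
                new_path = path + (selector,)
                if child is target:
                    results.append(new_path)
                next_frontier.append((new_path, child))
        frontier = next_frontier
    return results


root, c, cy = Node(), Node(), Node()
root.arcs["c"] = c
c.arcs["y"] = cy
cy.arcs[2] = c                      # arc back to c closes the cycle

for path in pathnames(root, c, 5):
    print(".".join(str(s) for s in path))   # c, c.y.2, c.y.2.y.2
```

Raising the bound by two yields one further pathname each
time, which is the pattern given in the text.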

A basic difference between our BL-graphs and the graphs
of [Denn 71] is that Dennis does not allow directed cycles
in his objects. Cycles seem to impair the management of
storage and the handling of parallelism in computation.
However, cycles occur in many of the structures we shall be
modeling. Moreover, they are difficult to detect and
remove (see [Amer 72] for more details on the problems of
cycles). We shall therefore not rule out cycles here.

We follow [Denn 71] in giving the structure of a BL- 
object which represents a state of the interpreter. An 
interpreter state is a BL-object having three components as 
follows: 

(1) The universe-component models system-resident
information, both data and procedures. Generally speaking,
this information is independent of which computations are
currently active or how far various computations have
progressed.

(2) The local-structure-component of an interpreter
state has as components a series of activation records for
the various procedures being interpreted in the system.
These components are called local structures; there is one
local structure for each activation of each base language
procedure. A local structure represents the environment for
its activation, primarily identifiers and their associated
values. Thus the local-structure component of an
interpreter state records the progress of computations by
modeling their changing environments.

(3) The control-component has as components a number of
sites of activity, which indicate for each current
computation the next instruction to be executed, the appropriate
environment (local structure) for the computation, and other
information.

We shall not go into the details here of representing
the universe- and control-components of interpreter states.
The interested reader can consult the Appendix for that kind
of information. We will be dealing almost exclusively with
local structures in the remainder of this chapter. In the
next section, we describe the action of a number of
primitive BL instructions.
2.2. Base Language Instructions

We introduce the primitive instructions of BL, which
define state transitions of the interpreter in our model.
Each BL instruction executed by the interpreter belongs to
some procedure written in BL and is interpreted during an
activation of the procedure. We call the local structure
corresponding to this activation the current local structure
(c.l.s.) for the instruction.

A BL instruction consists of an operation code and up to
three operands. The operation code is underlined. Most of
the operands of the various instructions are selectors, which
are frequently used to denote names of components of the
root node of the c.l.s. We reserve the letters x, y, and z
for selector names used in this fashion.

We shall give informal descriptions of the effects of 
BL instructions, accompanied by sample "before" and "after" 
diagrams of the c.l.s. A more formal definition of these
instructions may be found in the Appendix. 

Each instruction is designed to perform a specific
function in changing the c.l.s. This is called the primary
role (or, more simply, the role) of the instruction, and
depends on certain conditions being fulfilled (e.g. the
presence or absence of specific components in the c.l.s.). The
effect of an instruction when such conditions do not hold is
called a subsidiary effect, or subeffect.

The create instruction is used to create a new
component in the c.l.s. Provided that the c.l.s. has no
x-component, the primary role of the instruction create x is
to add one. The new component is an empty leaf node. If the
c.l.s. already has an x-component, then the instruction
create x has a subsidiary effect of changing the arc with
selector x from the root node to point to a newly allocated
node. For this subeffect the former x-component node will
remain as part of the c.l.s. only if it was shared with some
other node. Figures 2.2-2 through 2.2-4 illustrate
subeffects of the instruction create x and its interplay with
the sharing property. Portions of a diagram enclosed in
dotted lines are no longer part of the c.l.s. and can be
thought of as garbage-collected.

[Fig. 2.2-1. Role of create x]
[Fig. 2.2-2. A subeffect of create x]
[Fig. 2.2-3. A subeffect of create x]
[Fig. 2.2-4. A subeffect of create x]
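
The behavior of create x can be imitated in a few lines of
Python. The sketch below is an informal reading of the
description above, not the formal definition of the Appendix;
the Node class and the function name are assumptions. Python's
own storage reclamation plays the part of the garbage
collection mentioned in the text.

```python
class Node:
    """A BL-graph node: labeled outgoing arcs and an optional leaf value."""
    def __init__(self, value=None):
        self.arcs = {}
        self.value = value


def create(cls, x):
    """create x: the x-arc of the root ends up at a fresh empty leaf node."""
    cls.arcs[x] = Node()


cls = Node()                       # root node of the c.l.s.
create(cls, "x")                   # role: no x-component yet, so add one
old_x = cls.arcs["x"]

cls.arcs["y"] = Node()
cls.arcs["y"].arcs["m"] = old_x    # old_x is now shared between root and y

create(cls, "x")                   # subeffect: x-arc redirected to a new node
print(cls.arcs["x"] is old_x)              # False
print(cls.arcs["y"].arcs["m"] is old_x)    # True: kept because it was shared
```

In this encoding the role and the subeffect are carried out
by the same assignment; they differ only in whether a former
x-component exists and whether it survives through sharing.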

The clear instruction is used to make a node empty;
clear x detaches whatever hangs downward from the node x,
leaving x with an empty value. The old value of x is lost
even if it was shared with some other node. Figures 2.2-5
and 2.2-6 illustrate the role of clear x. If there is no
x-component in the c.l.s., clear x acts like create x
and generates one (fig. 2.2-7).

[Fig. 2.2-6. Role of clear x]
[Fig. 2.2-7. A subeffect of clear x]
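
The contrast between clear and create lies in whether the
node itself is changed. The Python sketch below reads the
description above as emptying the node x in place, so that
every other path to that node sees the change; this reading,
the Node class and the function name are assumptions.

```python
class Node:
    """A BL-graph node: labeled outgoing arcs and an optional leaf value."""
    def __init__(self, value=None):
        self.arcs = {}
        self.value = value


def clear(cls, x):
    """clear x: empty the node x in place, or create it if it is missing."""
    if x not in cls.arcs:
        cls.arcs[x] = Node()       # subeffect: behaves like create x
        return
    node = cls.arcs[x]
    node.arcs.clear()              # detach whatever hangs below x
    node.value = None              # leave x with an empty value


cls = Node()
shared = Node(5)
cls.arcs["x"] = shared
cls.arcs["y"] = Node()
cls.arcs["y"].arcs["m"] = shared   # same node reached as x and as y.m

clear(cls, "x")
print(cls.arcs["y"].arcs["m"].value)   # None: the old value is lost here too
clear(cls, "z")
print("z" in cls.arcs)                 # True
```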

The delete instruction removes arcs from the c.l.s. 
The arc from the root node to the node x is removed by the 
instruction delete x (figs. 2.2r-8 and 2.4*4}. The arc 
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Fig* 2.2-8. Role of 
delete x 
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with selector m from the node x is removed by the two- 
operand form delete x.m (figs. 2.2-10 and 2.2-11). if 
an are to be removed does not exist, then .JHm* sufeef feet of 
the- delete instruction is that no action be taken. 
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Fxg. 2.2-10. Role of 
delete x,m 
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Fig. 2.2-11. Role of 
delete x.m 
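
The behaviour of create, clear and delete described above can be mimicked in a few lines of Python. This is only an illustrative sketch, not part of the base language model: the class Node and the function names are our own, a node is modelled as an optional elementary value plus a dictionary of arcs, and we assume that the two-operand delete does nothing when the node x itself is absent (the text leaves that case open).

```python
class Node:
    """Hypothetical model of a BL node: a leaf value plus labeled arcs."""
    def __init__(self):
        self.value = None   # elementary value of a leaf; None means empty
        self.arcs = {}      # selector -> Node

def create(root, x):
    # Point the x-arc of the root at a newly allocated empty node.
    # A former x-component survives only if some other arc still reaches it.
    root.arcs[x] = Node()

def clear(root, x):
    # Empty the node x in place, so every sharer of that node sees it empty.
    if x not in root.arcs:
        create(root, x)          # no x-component: act like create x
        return
    node = root.arcs[x]
    node.value = None
    node.arcs = {}

def delete(root, x, m=None):
    # One operand: remove the root's x-arc. Two operands: remove the
    # m-arc of node x. A missing arc means no action is taken.
    if m is None:
        root.arcs.pop(x, None)
    elif x in root.arcs:         # assumption: absent x is also a no-op
        root.arcs[x].arcs.pop(m, None)
```

Note the asymmetry the sketch makes visible: create rebinds an arc and so leaves a shared node untouched, whereas clear mutates the node itself and so affects all sharers.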



The const instruction is used to attach elementary
objects to nodes. If v is any elementary object, then
const v,x causes the value v to be attached to the node x.
The old value of x, if any, is lost. Figure 2.2-12
illustrates the role of the instruction const 5,x (where x is
a leaf node), and figure 2.2-13 shows a subeffect of the
same instruction (for the case when x is not a leaf node).

Fig. 2.2-12. Role of const 5,x

Fig. 2.2-13. Subeffect of const 5,x

Arithmetic instructions such as add, subtr, mult and
div are used to manipulate elementary values. For example,
the instruction add x,y,z adds the values attached to
nodes x and y and places the sum in node z (figure 2.2-14).
It is an error to attempt to execute an arithmetic
instruction if one of the first two operand nodes fails to
exist or contains an improper value (not a leaf node or
empty or wrong type of elementary object). We leave the
effect of such an attempt undefined.

Fig. 2.2-14. Role of add x,y,z
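
As a rough illustration of const and add, the following Python sketch models the two instructions on a dictionary-based node graph. The names Node, const and add are ours. Two choices go beyond the text and are assumptions: the destination node is created if it is absent, and the error case, which the text leaves undefined, is reported by raising ValueError.

```python
class Node:
    """Hypothetical model of a BL node."""
    def __init__(self, value=None):
        self.value = value  # elementary value; None means empty
        self.arcs = {}      # selector -> Node

def const(root, v, x):
    # Attach elementary value v to node x; any old value or components are lost.
    node = root.arcs.setdefault(x, Node())   # assumption: create x if absent
    node.arcs = {}
    node.value = v

def add(root, x, y, z):
    # Sum the values of leaf nodes x and y and place the result in node z.
    for name in (x, y):
        node = root.arcs.get(name)
        bad = (node is None or node.arcs
               or isinstance(node.value, bool)
               or not isinstance(node.value, (int, float)))
        if bad:
            # The text leaves this case undefined; raising is our choice.
            raise ValueError("improper operand: " + name)
    const(root, root.arcs[x].value + root.arcs[y].value, z)
```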



The link instruction is used to initiate sharing
between nodes. The instruction link x,n,y causes the node
y to become the n-component of x (so that y will be shared
between the node x and the root node). This is done by
adding an arc with selector n from node x to node y. Figures
2.2-15 and 2.2-16 illustrate the role of the instruction
link x,n,y. If x already has an n-component or is a leaf
node with some elementary value, then the subeffect of the
same instruction causes the old value of x to be lost (figs.
2.2-17 and 2.2-18). The nodes for x and y must be present
or else the instruction is illegal.

Fig. 2.2-17. Subeffect of link x,n,y

Fig. 2.2-18. Subeffect of link x,n,y



The select instruction satisfies a dual purpose. If a
node x has an n-component, then the instruction select x,n,y
makes the n-component of x the y-component of the root node
(so that it can now be "addressed" by further BL
instructions). In this manner a BL procedure may gain access
to arbitrary nodes of a c.l.s. If x has no n-component, then
the instruction select x,n,y generates one first, then
makes it the y-component of the root node. This is the
principal way to construct BL-objects, i.e. by using the
select instruction to add on components. These two roles of
the select instruction are depicted in figures 2.2-19 and
2.2-20, respectively. The root node may or may not have a
y-component prior to the execution of select x,n,y. If it
does, then the value is lost unless it was shared.

Fig. 2.2-19. 1st role of select x,n,y

Fig. 2.2-20. 2nd role of select x,n,y
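
Sharing is easiest to see as object identity. The Python sketch below models link and select under the same illustrative conventions as before (Node, link and select are our names, not Dennis's). Signalling an illegal link by KeyError is an assumption, and the case of select applied to a leaf that carries a value is not treated.

```python
class Node:
    """Hypothetical model of a BL node."""
    def __init__(self):
        self.value = None   # elementary value; None means empty
        self.arcs = {}      # selector -> Node

def link(root, x, n, y):
    # Add an arc with selector n from node x to node y, so that y is
    # shared between x and the root. Both nodes must already exist.
    if x not in root.arcs or y not in root.arcs:
        raise KeyError("link: operand node missing")   # our choice of error
    node_x = root.arcs[x]
    node_x.value = None              # a leaf value of x is lost
    node_x.arcs[n] = root.arcs[y]    # an old n-component is replaced

def select(root, x, n, y):
    # Make the n-component of x the y-component of the root,
    # generating an empty n-component first if x has none.
    node_x = root.arcs[x]
    if n not in node_x.arcs:
        node_x.arcs[n] = Node()
    root.arcs[y] = node_x.arcs[n]
```

After link x,n,y the expressions root.arcs[x].arcs[n] and root.arcs[y] denote the very same Python object, which is exactly the sharing property of the model.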



The apply instruction provides for the activation of BL
procedures. Let the p-component of the c.l.s. represent the
BL code for some procedure (i.e. be a procedure structure).
Then the instruction apply p,x activates this procedure
in the following manner: First, a new, empty local
structure is created. The x-component of the c.l.s. is then
made the $par-component (parameter linkage) for the new, local
structure (we refer to the BL-object x as an argument
structure). Finally, control is passed to a new site of
activity. This means that the newly-created local structure
becomes the c.l.s. and the old site of activity is made
dormant. The interpreter will now execute instructions from
the procedure p until it is told to return.

The return instruction provides for termination of the
execution of a BL procedure and for return to the calling
procedure. Upon execution of a return instruction, the
c.l.s. is deleted. All its components vanish. The parameter
linkage, since it shares with the argument structure of
the invoking procedure's local structure, remains. Control
is returned to the dormant site of activity for the invoking
procedure, and its local structure becomes the new c.l.s.
The invoking procedure resumes from where it left off.
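
The interplay of apply and return with sharing can be sketched as a stack of local structures. The Python below is a simplified illustration only: Interpreter, current, apply and ret are hypothetical names, and the execution of the procedure text itself is omitted, so that only the creation of the new local structure and the $par linkage are modelled.

```python
class Node:
    """Hypothetical model of a BL node."""
    def __init__(self):
        self.value = None
        self.arcs = {}      # selector -> Node

class Interpreter:
    def __init__(self):
        # Dormant local structures lie below; the c.l.s. is on top.
        self.stack = [Node()]

    @property
    def current(self):
        return self.stack[-1]

    def apply(self, x):
        # Create a new, empty local structure whose $par-component is
        # the caller's x-component (shared, not copied).
        local = Node()
        local.arcs["$par"] = self.current.arcs[x]
        self.stack.append(local)

    def ret(self):
        # Delete the c.l.s.; the caller's local structure becomes current.
        self.stack.pop()
```

Because the $par-component is the same node as the argument structure, any change the called procedure makes through $par is still visible to the caller after the return, while components created locally vanish.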

In order to invoke a procedure, it must be represented
as a component of the c.l.s. The move instruction makes
data in the universe available for invocation as a BL
procedure. We will not have occasion to use this instruction
here; further details are found in the Appendix.

The instructions of a BL procedure are labeled with
natural numbers; execution of a BL procedure consists of the
successive execution of its instructions in sequence
according to the numbers labeling them. The remaining BL
instructions provide for changes in the control sequence.
Each of them has as one of its operands a label ℓ, which must
be a natural number labeling some instruction of the procedure
currently being executed.

The instruction goto ℓ transfers control to the
instruction in the current procedure whose label is the
natural number ℓ.

The instruction elem? x,ℓ tests whether the x-component
in the c.l.s. is a leaf node (elementary object). If
not, control passes to instruction number ℓ.

The instruction empty? x,ℓ checks whether the
x-component of the c.l.s. is an empty leaf node (i.e. no
components and no elementary value). If not empty, control
transfers to instruction number ℓ.

The instruction nonempty? x,ℓ performs the same
test as the corresponding empty? instruction, but control
passes to ℓ if the x-component is empty.

The instruction eq? x,y,ℓ looks at the x- and
y-components of the c.l.s. Both must be leaf nodes, or else
the effect of this instruction is undefined. These nodes
are checked to see if they have the same elementary value.
If the test fails (i.e. their values are not equal), then
control passes to ℓ.

The instruction has? x,m,ℓ checks whether the
x-component object of the c.l.s. has an m-component. If not,
control passes to ℓ.

The instruction same? x,y,ℓ checks whether the x-
and y-components of the c.l.s. share the same node. If not,
i.e. they are distinct nodes, control passes to ℓ.

In all the above conditional instructions, if the 
c.l.s. fails to have a component indicated by some operand, 
then the effect is undefined. 

Other conditional instructions analogous to the above 
ones can be defined (e.g. testing whether one elementary 
value is less than another) . We will have no need here for 
such additional instructions. 

Finally, we discuss one more instruction that will be
needed. Given a BL object, we will want to be able to
access each of its components, without knowing beforehand
the names of the selectors. The getc instruction serves
this purpose. Successive executions of the same instruction
getc x,i,ℓ extract successive components of the
x-component of the c.l.s. by causing the i-component of the
c.l.s. to assume as its successive values the selectors on
the arcs leading from the node x. No component will be
extracted more than once, and control passes to ℓ when no
more components of x remain to be accessed.

2.3. Programming Conventions for BL 

In this section we introduce a few programming conven- 
tions which will make BL procedures easier to write and un- 
derstand. We can view BL as the machine language for a 
hypothetical computer. Our conventions are then similar to 
the programming features provided by a macro-assembler. 

Although individual instructions in a BL procedure are
labeled by natural numbers, we shall use symbolic labels.
For example, suppose that x and y denote leaf nodes in
the c.l.s. Then the BL code of figure 2.3-1 places the
string value "yes" in the node ans if the values of x and y
are equal, "no" if they aren't.

            eq?     x,y,no
            const   'yes',ans
            goto    skip
    no:     const   'no',ans
    skip:   ....

Fig. 2.3-1. Use of symbolic labels in BL
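
Replacing symbolic labels by natural numbers is the usual two-pass job of an assembler. The Python sketch below is one possible way to do it and is not taken from the base language definition; the tuple representation of an instruction, the function name resolve_labels and the placeholder opcode nop are all our own assumptions. Only the last operand of a control instruction (goto, getc, or any instruction whose name ends in "?") is treated as a label.

```python
def resolve_labels(lines):
    """lines: list of (label_or_None, opcode, operands) in program order.
    Returns a list of (opcode, operands) with labels replaced by numbers."""
    # Pass 1: number the instructions from 1 and record where labels fall.
    labels = {label: number
              for number, (label, _, _) in enumerate(lines, start=1)
              if label is not None}
    # Pass 2: substitute the label operand of each control instruction.
    resolved = []
    for _, opcode, operands in lines:
        operands = list(operands)
        is_control = opcode in ("goto", "getc") or opcode.endswith("?")
        if is_control and operands:
            operands[-1] = labels[operands[-1]]
        resolved.append((opcode, operands))
    return resolved

# The code of figure 2.3-1; "nop" stands in for the elided continuation.
figure_2_3_1 = [
    (None,   "eq?",   ["x", "y", "no"]),
    (None,   "const", ["'yes'", "ans"]),
    (None,   "goto",  ["skip"]),
    ("no",   "const", ["'no'", "ans"]),
    ("skip", "nop",   []),
]
numbered = resolve_labels(figure_2_3_1)
```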

The nodes addressed by operands in the BL instructions
must be direct components of the root node of the c.l.s.
With the select instruction, we can access nodes further
down in the c.l.s. For instance, suppose we wish to change
the value 3 in figure 2.3-2 into the value 4. This is
done by the const instruction, but in order to access the
proper node, we must use the select instruction three
times. In the BL code that performs our task (figure 2.3-3),
the reserved selector $temp acts as a temporary variable.
By using a "dotted pathname" convention to refer to
appropriate nodes, we can abbreviate this BL code as the
single instruction const 4,x.b.d.e. This can be viewed as a
macro-instruction whose expansion gives the required select
instructions. Alternatively, we can look at this convention
as extending "addressability" to arbitrary nodes in the
c.l.s.

    select  x,b,$temp
    select  $temp,d,$temp
    select  $temp,e,$temp
    const   4,$temp

Fig. 2.3-3. BL code to access a node
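
The macro expansion behind the dotted pathname convention is mechanical, as the following Python sketch suggests. The function expand_const is a hypothetical helper of ours that handles only the const instruction and produces BL text as strings; it is meant to show the shape of the expansion, not to define it.

```python
def expand_const(value, path):
    """Expand 'const value,path' with a dotted pathname into plain BL text."""
    head, *selectors = path.split(".")
    if not selectors:
        return ["const {},{}".format(value, head)]   # already a direct component
    code = []
    source = head
    for selector in selectors:
        # Walk one arc further down, keeping the result in $temp.
        code.append("select {},{},$temp".format(source, selector))
        source = "$temp"
    code.append("const {},$temp".format(value))
    return code

expansion = expand_const(4, "x.b.d.e")
```

For the pathname x.b.d.e the result is the four instructions of figure 2.3-3.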

We will make frequent use of a macro-substitution
capability, which is provided by a "*" convention. If z is a
leaf node containing some elementary value, then *z denotes
this elementary value. For example, in the c.l.s. of figure
2.3-2, *z denotes the value 6. The abbreviation const *z,y
specifies the same transition as the instruction const 6,y
when the c.l.s. is in this state. In the c.l.s. of figure
2.3-4, the leaf node with value 2 can be addressed by any of
the forms x.a, x.*z, *y.a, or *y.*z, while the value 2
itself can be denoted by any of the forms *(x.a), *(x.*z),
*(*y.a), or *(*y.*z). As a further example, the BL code of
figure 2.3-5 sets all the components of the object x to
zero. Note that the leaf node i contains as successive
values the names of the selectors from x. Thus the dotted
pathname x.*i refers to the successive component nodes of x.

    loop:   getc    x,i,out
            const   0,x.*i
            goto    loop
    out:    ....

Fig. 2.3-5.



We now define several macros for BL to denote commonly
performed functions. The .setl macro (set up local
structure) is used to set up new components in the c.l.s.
Figure 2.3-6 shows the definition of the .setl macro, and
figure 2.3-7 gives an example of its effect.

    .setl (x1,...,xn)   =   create x1
                                .
                                .
                            create xn

Fig. 2.3-6. Expansion of .setl macro

Fig. 2.3-7. Effect of .setl (x,y)





The remaining macros we will use deal with linkage
between BL procedures. We first define a procedure closure to
be a BL-object with two components. The $text-component
contains BL text of a procedure, and the $env-component
contains references to the global variables named in the
procedure. (Note that "$" is a legal character in BL.)

The .call macro expands into BL code to invoke a
procedure. In the definition in figure 2.3-8, the node p must
be a procedure closure, and a1,...,an are selectors
leading to the arguments, which may be arbitrary BL-objects.
Figure 2.3-9 gives an example of the invocation of a
procedure p having a single global reference w; the procedure
p is called with arguments x and y. The "old c.l.s." is the
local structure of the invoking procedure, and the "new
c.l.s." is the local structure of the called procedure p.
The "after" picture shows both the old c.l.s. and the new
c.l.s. when control is passed to the procedure p.

    .call p,(a1,...,an)   =   create  $arg
                              link    $arg,$glob,p.$env
                              link    $arg,1,a1
                                  .
                                  .
                              link    $arg,n,an
                              apply   p.$text,$arg
                              delete  $arg

Fig. 2.3-8. Expansion of the .call macro

Fig. 2.3-9. Effect of .call p,(x,y)
The .getp macro (get parameters) serves to bind the
formal parameters of a procedure to the actual arguments
with which it was invoked. The .getg macro (get globals)
makes the global variables named in a procedure accessible
in its body. These two macros are defined in figures
2.3-10 and 2.3-11.

    .getp (x1,...,xn)   =   select  $par,1,x1
                                .
                                .
                            select  $par,n,xn

Fig. 2.3-10. Expansion of the .getp macro

    .getg (x1,...,xn)   =   select  $par.$glob,x1,x1
                                .
                                .
                            select  $par.$glob,xn,xn

Fig. 2.3-11. Expansion of the .getg macro



The first actions a procedure normally performs when 
given control are the retrieval of parameters and global 
variables (using the .getp and .getg macros respective- 
ly) . Figure 2.3-12 is a "continuation" of figure 2.3-9, 
showing both c.l.s.'s after the invoked procedure p executes 
the two macros .getp (u,v) and .getg (w) . 

With the BL programming conventions that have been de- 
fined here, we are now ready to use BL as the language of 
our semantic model. 
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Chapter 3 
STRUCTURES, POINTERS AND SHARING 



3.1. Mini-Languages 

In this chapter we present a series of mini-languages
which treat the issues of structures, pointers and sharing.
The progression of mini-languages is hierarchical in that it
starts from a few basic concepts and proceeds outward by
extension. Mini-Language 0 is the "kernel" language,
isolating the notions of variables, values and assignment.
These basic concepts form the core for our domain of
discourse. Mini-Language 1 is a direct extension of
Mini-Language 0, adding to it structured values and the
notions of construction of structured objects and selection
of components from structures. Mini-Language 2 extends
Mini-Language 1 by including pointers and the two operations
of building and following pointers. Finally, Mini-Language 3
treats the idea of sharing of components between objects.
By revising the concept of structured value found in
Mini-Language 1, the notions relating to pointers are
subsumed in Mini-Language 3 by notions relating to sharing.

Each mini-language is treated in a separate section of
this chapter. In each section, we first discuss in general
terms the concepts addressed by the mini-language under
consideration. New terminology is introduced, and we describe
the relation to previous and/or succeeding mini-languages.
We then supply a BNF-style syntax together with a
description of the syntactic classes and what they represent.
The semantics of the mini-language is stated informally, a la
ALGOL 60. We then formalize the semantics by giving samples
of rules for translation from the mini-language into the
base language BL. Each section is concluded by a "movie"
illustrating the interpretation of the BL program produced
by the translator from a sample program in the mini-language.

The final section of this chapter applies these
mini-languages to the task of describing the data structuring
semantics of "real-world" programming languages. The lan- 
guages PAL, QUEST! and SNOBOL4 are used as examples. 

3.2. Mini-Language 0 - Basics

Mini-Language 0 (ML-0) is the foundation upon which we
build our mini-language setup. In introducing the concepts
of value, location and assignment, ML-0 serves as a kernel
for our set of mini-languages. The notions of structures,
pointers and sharing will emerge as extensions to ML-0 in
succeeding mini-languages.

All our mini-languages, starting with ML-0, operate
within the conceptual world of values stored in locations
which we call cells. The relationship between a cell and
the value stored in it is called the contents mapping. A
cell with no value stored in it is said to be empty and has
no contents. We are concerned here with the fundamental
operation of assignment, which is used to change the contents
mapping. In fact, the entire purpose in creating ML-0 was
to isolate the concept of assignment by placing it in as
minimal and austere a set of surroundings as possible. This
notion of assignment will remain unchanged in the remaining
mini-languages of this chapter. The assignment statements
of these languages will be "consistent" extensions of what
we define in this section.

Another important concept we deal with here is the
notion of binding. Each identifier in an ML-0 program is
associated with a unique and distinct cell. This association
is called the binding of an identifier. The value of
an identifier will be the contents of the cell to which it
is bound. (An identifier bound to an empty cell has no
value.) Unlike the contents mapping, the binding relation
remains invariant throughout the execution of an ML-0
program. This invariance is a property not only of ML-0, but
of all the mini-languages in this thesis.

Syntax of ML-0

We give a BNF-style syntax for ML-0. Informal use is
made of the ellipsis ("...") to indicate repetition. Two
syntactic classes are primitive: (integer) denotes integer
constants, and (identifier) denotes alphanumeric strings
starting with a letter.

    (program)      ::=  (assignment) ; ... ; (assignment)
    (assignment)   ::=  (destination) ← (expression)
    (expression)   ::=  (destination) | (generator) | nil
    (destination)  ::=  (identifier)
    (generator)    ::=  (integer)

Description 

To understand assignment, we explain the syntactic
classes relating to values and cells. A (generator) is a
piece of program text denoting a value. All values in ML-0
are integers; subsequent mini-languages include other types
of values as well. A (destination) is a piece of program
text referring to a cell; (destination)s in ML-0 are simply
(identifier)s, i.e. variable names. The reserved word nil
will be used to signify empty cells. An (expression) is a
piece of program text which "yields" a value. The semantic
description below discusses evaluation of (expression)s in
ML-0.

An ML-0 (program) is simply a sequence of (assignment)s,
each of which consists of a (destination) and an (expression).
The basic meaning of an (assignment) is to cause the value
yielded by the (expression) to be stored into the cell
referred to by the (destination).

Semantics of ML-0 (informal)

The notions we have just introduced will now be made 
more precise. We give the semantics associated with each 
significant syntactic class of ML-0 (now as a description in 
English, later more formally via translation into BL) . 

(1) (program)s: The execution of an ML-0 (program)
consists of two steps. First bind each (identifier)
occurring in the (program) to a distinct, empty cell. Then
execute all of the (assignment)s sequentially, left to 
right. This rule giving semantics of (program) s will remain 
intact for all the subsequent mini-languages in this chapter. 
(2) (assignment)s: The execution of an (assignment)
consists of three steps:

(i) Identify the cell referred to by the
(destination) on the left-hand side of the
(assignment) (see rule (3) below).

(ii) Obtain the value yielded by the (expression)
on the right-hand side (see rule (4) below).

(iii) Make the value from step (ii) the new contents
of the cell from step (i).

Thus the effect of executing an (assignment) is a change in
the contents mapping. This rule, like rule (1), will govern
the semantics of the remaining mini-languages.

(3) (destination)s and (identifier)s: A (destination)
in ML-0 is always some (identifier), and refers to the cell
bound to this (identifier). This binding is determined at
the beginning of program execution; as we have already said,
it remains constant throughout execution.

(4) (expression)s: There are three varieties of
(expression) in ML-0. We describe their semantics in rules
(5), (6) and (7) below.

(5) nil: The special symbol nil indicates the absence
of a value. Any time we are directed to store in some cell
the value yielded by an (expression) which is nil, this
means to make the cell empty. All of our mini-languages
treat nil in precisely this manner.

(6) (destination)s as (expression)s: When a
(destination) occurs as an instance of an (expression) (in
ML-0, this means on the right-hand side of an (assignment)),
it yields the value contained in the cell to which it refers
(see rule (3) above). If this cell is empty, the
(expression) is treated like nil (see rule (5) above). This
semantic rule (known elsewhere as "dereferencing") will hold
verbatim for all our mini-languages.

(7) (generator)s: A (generator) in ML-0 is an
(integer), which is the decimal representation of some
integer value. It is this value which is yielded by the
(generator).

The above seven rules constitute our informal
description of the semantics of ML-0.
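
The seven rules are small enough to be transcribed almost directly. The Python sketch below is an informal rendering under assumptions of our own: a program is given as a list of (destination, expression) pairs rather than as text, an expression is an int, an identifier string, or None standing for nil, and the binding and the contents mapping are collapsed into a single dictionary keyed by identifier (which is legitimate here because the binding never changes).

```python
def run_ml0(program):
    """Interpret an ML-0 program given as (destination, expression) pairs."""
    # Rule (1): bind each identifier to a distinct, empty cell (None = empty).
    cells = {}
    for destination, expression in program:
        cells[destination] = None
        if isinstance(expression, str):
            cells[expression] = None
    # Rules (2) to (7): execute the assignments left to right.
    for destination, expression in program:
        if isinstance(expression, str):
            value = cells[expression]   # rule (6): dereference; empty acts as nil
        else:
            value = expression          # rule (7) integer, or rule (5) nil
        cells[destination] = value      # rule (2), step (iii)
    return cells

final_state = run_ml0([
    ("x", 3),       # x ← 3
    ("y", "x"),     # y ← x
    ("x", "z"),     # x ← z   (z is still empty, so x becomes empty)
    ("z", 4),       # z ← 4
    ("y", None),    # y ← nil
])
```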

BL Representation 

The semantic rules we just gave are a bit long-winded
and imprecise. A rigorous description of the semantics of
ML-0 can be obtained by "translating" these rules into BL
instruction sequences. Before doing this, we discuss our
basic conventions for representing mini-language programs in
the base language model. To each program in one of our
mini-languages, there is a single local structure. The
cells used by the program are represented by nodes in the
local structure. For each identifier occurring in the
program, there is a correspondingly named component of the
local structure which gives its binding. In other words,
the cell bound to an identifier x will be the x-component
node of the local structure. The contents of this cell is
the object of its node. Thus the BL translation of any
program in one of our mini-languages will have a "prologue"
to bind the identifiers of the program. For example, the
prologue for an ML-0 (program) whose (identifier)s are x, y
and z will be the BL macro-instruction .setl (x,y,z), which
expands into the sequence create x; create y; create z,
creating nodes for the cells bound to these (identifier)s.
Integer values are represented in the base language model by
elementary objects of type integer.

As for the translation rules themselves, we give sample
ML-0 statements ((assignment)s) and the BL code they are
translated into. Each example is illustrated by one or two
"before and after" pictures showing the change the statement
makes in the local structure. Although our examples are
meant to be indicative rather than exhaustive, they should
be more than sufficient to give the reader a complete
picture of the rules for translation from ML-0 into BL.

There are essentially three kinds of (assignment)s
in ML-0:

(1) (identifier) ← nil
e.g. x ← nil is translated
into the BL code
clear x (fig. 3.2-1).

Fig. 3.2-1. Effect of the ML-0 (assignment) x ← nil

(2) (identifier) ← (integer)
e.g. y ← 2 is translated into the BL code
const 2,y (figs. 3.2-2 and 3.2-3).

Fig. 3.2-3. Effect of y ← 2 in ML-0

(3) (identifier) ← (identifier)
e.g. y ← x is translated into the BL code
.call assign0,(x,y). This code invokes a BL procedure named
assign0, which performs the operation specified by the ML-0
(assignment). The definition of the procedure assign0 is
shown in figure 3.2-4, and two examples of the ML-0
(assignment) y ← x are pictured in figures 3.2-5.



    assign0:  .getp        (u,v)
              [illegible]  u,move
              clear        v
    move:     const        *u,v

Figure 3.2-4. Definition of the BL procedure assign0
Figs. 3.2-5. Effect of y ← x in ML-0

The three translation rules here give us a precise
formulation for the semantics of ML-0 in terms of the
semantics of the base language model.
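
Since the translation consists of a prologue plus one rule per kind of (assignment), it can be sketched as a short function. In the Python below the name translate_ml0 and the input representation (a list of (destination, expression) pairs, with None for nil) are assumptions of ours; the emitted BL text follows the three rules just given, with identifiers listed in the prologue in order of first occurrence.

```python
def translate_ml0(program):
    """Translate ML-0 assignments into a list of BL instruction strings."""
    # Prologue: one component of the local structure per identifier.
    names = []
    for destination, expression in program:
        for name in (destination, expression):
            if isinstance(name, str) and name not in names:
                names.append(name)
    code = [".setl (" + ",".join(names) + ")"]
    for destination, expression in program:
        if expression is None:                      # (identifier) ← nil
            code.append("clear " + destination)
        elif isinstance(expression, int):           # (identifier) ← (integer)
            code.append("const {},{}".format(expression, destination))
        else:                                       # (identifier) ← (identifier)
            code.append(".call assign0,({},{})".format(expression, destination))
    return code

bl_code = translate_ml0([
    ("x", 3), ("y", "x"), ("x", "z"), ("z", 4), ("y", None),
])
```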

ML-0 Movie 

We conclude this section by giving a sample ML-0 
(program) together with its BL translation. Our example is 
accompanied by a sequence of pictures forming a "movie" to 
illustrate the changing state of the local structure as the 
program is interpreted, statement by statement. 
    ML-0            BL

                    .setl   (x,y,z)
    x ← 3;          const   3,x
    y ← x;          .call   assign0,(x,y)
    x ← z;          .call   assign0,(z,x)
    z ← 4;          const   4,z
    y ← nil         clear   y

[Movie frames: state of the local structure after each statement, ending with y ← nil]





3.3. Mini-Language 1 - Structures

Mini-Language 1 (ML-1) adds the notion of data
structures to the foundation provided by ML-0. As we have said
before, a structure is a data object which consists of
individually accessible component objects. There are two
fundamental operations relating directly to this concept of
structures: (1) construction of a structured object whose
components will be objects with given values, and (2)
selection of component objects from a structure. ML-1 provides
for these operations while retaining intact the concepts and
mechanisms of ML-0. In particular, the notions of cells,
values, contents, binding and assignment are exactly as
before.

In addition to the integer values found in ML-0, ML-1
provides a new class of structures. A structured value
consists of a sequence of component values (which may be
integers or structures). To store away a structured value, we
require one cell for the structure, and also separate cells
to hold the values of its components. This requirement is a
departure from ML-0, in which all cells in use are bound to
identifiers. Component cells must now be handled by some
kind of free-storage management mechanism, which we call the
cell allocator.

In ML-1, a cell may assume successive values of
different types (an integer one moment and a structure the
next, or vice versa). There are no restrictions on what values
may be stored in which cells. There is a need, however, to
detect references to nonexistent components of a structure.
Such error-checking will have to be performed by the
defining interpreter.

Syntax of ML-1

There is a new primitive syntactic class here, namely
(selector), which denotes alphanumeric strings together with
integers.

    (program)       ::=  (assignment) ; ... ; (assignment)
    (assignment)    ::=  (destination) ← (expression)
    (expression)    ::=  (destination) | (generator) | nil
    (destination)   ::=  (identifier) | (selection)
    (selection)     ::=  (selector) of (expression)
    (generator)     ::=  (integer) | (construction)
    (construction)  ::=  [ (field) ; ... ; (field) ]
    (field)         ::=  (selector) : (expression)

Description


Structures in ML-1 are sequences of component values.
Each component in a structure has associated with it a
(selector). The selection operation gives individual access
to the components of a structure by using the (selector)s to
indicate the appropriate components. Thus, for example, the
(selection) a of x refers to the component of the
structure x having the (selector) named "a".
The notion of (destination) is extended in ML-1 to
include selections of component objects from structures. In
particular, (selection)s may appear on both sides of
(assignment)s. This allows for selective updating of
components of a structure. A (selection) occurs as an instance
of a (destination) and refers to a component cell for a
structure. In this way, ML-1 preserves the ML-0 association
between (destination)s and cells.

Also as in ML-0, distinct (destination)s refer to
distinct cells. There is no sharing of data.

All values in ML-1 are created by instances of
(generator)s. A (construction) is a special kind of
(generator) provided by ML-1 for building structured values.
In a (construction), we simply supply (expression)s yielding
values for the components with the associated (selector)s.
Each component name/value pair is called a (field). Thus
the two kinds of (generator)s, namely (integer)s and
(construction)s, produce the two kinds of values in ML-1.

Semantics of ML-1 (informal)

As with ML-0, in order to lend precision to the notions
we have introduced, we give an informal description of the
semantics associated with each significant syntactic class
of ML-1.

(1) (program)s: The semantic rule for an ML-1 (program)
is identical to rule (1) in the previous section for ML-0
(program)s.

(2) (assignment)s: ML-1 (assignment)s work by the same
principles as in ML-0, but there is a new factor here.
Suppose the value yielded by the (expression) on the right-hand
side of an (assignment) is some structure. Then new cells 
must be allocated to store the component values of this 
structure. The component cells are said to be subordinate 
to the cell for the structure they belong to (i.e. to the 
cell referred to by the (destination) on the left-hand side 
of the (assignment)) . Moreover, if a cell containing a 
structured value is assigned some new value, then the com- 
ponent cells subordinate to this cell are detached and left 
for the cell allocator to garbage-collect. Structured val- 
ues are copied on assignment, component by component (and 
recursively for structure-valued components) . 
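
Copying on assignment is what rules out sharing in ML-1. As a minimal illustration, assume (for this sketch only) that a structured value is represented by a Python dictionary from selectors to values and that the cells bound to identifiers form another dictionary; the function name assign is ours. A deep copy then plays the part of allocating fresh subordinate cells, and Python's own garbage collector plays the part of the cell allocator reclaiming detached cells.

```python
import copy

def assign(cells, destination, value):
    # Store a component-by-component copy (recursive for nested structures),
    # so the destination never shares component cells with the source.
    cells[destination] = copy.deepcopy(value)

cells = {"x": {"a": 1, "b": {"c": 2}}, "y": None}
assign(cells, "y", cells["x"])   # y ← x
cells["y"]["b"]["c"] = 9         # selective update of a component of y
```

After the update the c-component reached through x still holds 2, since x and y occupy disjoint sets of cells.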

(3) (destination)s: There are two kinds of
(destination)s in ML-1. (identifier)s are handled exactly
as in rule (3) for ML-0. We now discuss (selection)s.

(4) (selection)s : A (selection) consists of a 
(selector) and an (expression). The value yielded by the 
(expression) (see rule (5) below) is determined. This 
value must be a structure, or the effect of the 
(selection) is undefined. Furthermore, this structure must 
have some component with the given (selector). Finally, 
this component must be stored in some component cell (which 
was allocated when the structured value was constructed) . 
Then this component cell is the cell referred to 
by the (selection). 

(5) (expression)s: With respect to the three kinds of
(expression)s in ML-1, the occurrence of the indicator nil
or of a (destination) is treated exactly as in ML-0. As for
(generator)s, the only aspect we need to explain here is the
semantic rule for (construction)s.

(6) (construction)s: A (construction) consists of a
sequence of (field)s, each with a (selector) and an
(expression). Each (field) represents a component with the
indicated (selector) and with value yielded by the
(expression). The rule for interpretation of a (field)
consists of three steps:

    (i)   Evaluate its (expression).

    (ii)  Allocate a new cell and store the value from
          step (i) in it. The new cell is left empty if
          step (i) yields no value.

    (iii) Associate the newly allocated component cell
          (and the value it now contains) with the
          (selector).

The semantic rule for a (construction) is to interpret its
(field)s sequentially, left to right, as specified above.
This results in a series of components stored in component
cells and accessible by (selector)s, that is, a structure.
There is one restriction on (construction)s: the (selector)s
of its (field)s must be distinct, or else such a
(construction) is illegal and has undefined effect.
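
The three steps for a (field), together with the distinct-selector restriction, can be sketched in Python. This is only an illustrative model and not part of the ML-1 definition: the names Cell and construct are assumptions, a structure is modelled as a dict from selector to component cell, and step (i) is taken as already done (the caller passes evaluated values). Raising an error on a duplicate selector is one possible reading of "undefined effect".

```python
class Cell:
    """Hypothetical model of a cell: holds None (empty), an int, or a
    dict mapping selectors to component cells (a structure)."""

    def __init__(self, value=None):
        self.value = value


def construct(fields):
    """Interpret (selector, value) fields left to right, ML-1 style."""
    structure = {}
    for selector, value in fields:
        if selector in structure:
            # ML-1 leaves this case undefined; the sketch rejects it.
            raise ValueError(f"duplicate selector: {selector!r}")
        cell = Cell(value)           # step (ii): allocate a cell, store the value
        structure[selector] = cell   # step (iii): associate cell with selector
    return structure


# [ a:1; b:[ c:2; d:nil ] ]
inner = construct([("c", 2), ("d", None)])
outer = construct([("a", 1), ("b", inner)])
print(outer["b"].value["c"].value)
```

Each call allocates one fresh cell per field, which is why the structure built by a (construction) never shares cells with existing ML-1 values.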

We represent structured values in BL by BL-objects in which
the root node corresponds to the cell the structure is
stored in, and in which the arcs emanating from the root are
labeled with the (selector)s of the structure and lead into
nodes representing the corresponding component cells. One
example we have already seen is the environment (local
structure) for a program, which is a structured value whose
components are the variables used in the program. Another
example is the structure generated by the (construction)
[ a:1; b:[ c:2; d:nil ] ], whose BL representation is
pictured in fig. 3.3-1.



Fig. 3.3-1. BL-object for a structure


A valid ML-1 (destination) corresponds to a node
addressable by a compound pathname. For instance, if the
structured value of figure 3.3-1 is assigned to the
(identifier) x, then the cell referred to by the
(destination) c of b of x will be represented by the
node x.b.c.

As with ML-0, an ML-1 (program) whose (identifier)s are
x1, ..., xn has in its BL translation the prologue
.setl (x1,...,xn). We now treat translation of various ML-1
(assignment)s into BL, illustrating general translation
techniques that can be readily applied to any ML-1
statement. The following cases are representative:

    (1) (identifier) ← nil
and (2) (identifier) ← (integer)

are both handled exactly as in ML-0 by the respective BL
primitives clear and const. Note that the action of these
BL instructions disconnects any subordinate component cells
that need to be detached.



(3) (identifier) ← (identifier)
e.g. y ← x. This kind of ML-1 (assignment) poses a problem
in translation when the source (expression) x has a
structured value. In that case, the structured value for x
must be copied component by component into y, creating new
cells as required to hold new components of y. This kind of
action is illustrated in figure 3.3-2. We shall translate
the (assignment) y ← x as a call on a BL procedure named
assign1, so the BL code for the statement y ← x will be
.call assign1,(x,y). The code for the BL procedure assign1
is shown in figure 3.3-3. If x is empty or has an integer
value, then assign1 works like the assign0 procedure which
translates the corresponding ML-0 (assignment). If x has a
structured value, then for each component of x, we generate
a corresponding component for y (allocating a new cell) and
call assign1 recursively to give this component of y the
proper value. Here, the parameter u corresponds to x, and
the parameter v corresponds to y.

Fig. 3.3-2. Sample effect of the ML-1 (assignment) y ← x
when x has structured value.

    assign1:  .getp      (u,v)
              clear      v
              nonempty?  u,out
              elem?      u,struc
              const      *u,v
              return
    struc:    .getg      (assign1)
    loop:     getc       u,i,out
              .call      assign1,(u.*i,v.*i)
              goto       loop
    out:      return

Figure 3.3-3. Definition of the BL procedure assign1.

(4) (identifier) ← (selection)
e.g. y ← b of x. The pitfall here is that we must check to
verify that x indeed has a b-component. The following BL
code takes care of this test:

    has?   x,b,error
    .call  assign1,(x.b,y)

The label "error" refers to some unspecified place we branch
to if x has no b-component.

Fig. 3.3-4. Effect of y ← b of x in ML-1.

(5) (selection) ← (identifier)
e.g. c of a of y ← x is translated into the BL code

    has?   y,a,error
    has?   y.a,c,error
    .call  assign1,(x,y.a.c)        (figure 3.3-5).

(6) (identifier) ← (construction)
e.g. y ← [ a:3; b:nil; c:x ] translates into

    clear  y
    const  3,y.a
    clear  y.b
    .call  assign1,(x,y.c)          (figure 3.3-6).

Fig. 3.3-5. Effect of c of a of y ← x

Fig. 3.3-6. Effect of y ← [ a:3; b:nil; c:x ]
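
The recursive copying performed by assign1 in case (3) can be modelled in Python as a rough sketch. The Cell class and the dict-based structure encoding are assumptions made for illustration; the comments map each step back to the BL instruction of figure 3.3-3 it is meant to mirror.

```python
class Cell:
    """Hypothetical cell: value is None (empty), an int, or a dict
    mapping selectors to component cells."""

    def __init__(self, value=None):
        self.value = value


def assign1(u, v):
    """Copy the value held in cell u into cell v, ML-1 style."""
    v.value = None                        # clear v (old components detached)
    if u.value is None:                   # nonempty? u,out
        return
    if not isinstance(u.value, dict):     # elem? u,struc
        v.value = u.value                 # const *u,v
        return
    v.value = {}
    for selector, component in u.value.items():   # getc u,i,out
        v.value[selector] = Cell()                 # new component cell of v
        assign1(component, v.value[selector])      # .call assign1,(u.*i,v.*i)


# x <- [ a:1; b:[ c:2; d:nil ] ];  y <- x
x = Cell({"a": Cell(1), "b": Cell({"c": Cell(2), "d": Cell()})})
y = Cell()
assign1(x, y)
y.value["a"].value = 9        # a of y <- 9 leaves x untouched
print(x.value["a"].value, y.value["b"].value["c"].value)
```

Because every component of y lives in a freshly allocated cell, later updates through y cannot be observed through x.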



There is a subtle pitfall in these translations. Special
care must be taken in translating (assignment)s in which
the left-hand side and the right-hand side both refer to
cells in the same structure. Suppose, for example, that y
has the structured value depicted in figure 3.3-7.
Translating the (assignment) b of y ← y into the BL code

    has?   y,b,error
    .call  assign1,(y,y.b)

will not yield the correct results of figure 3.3-8. Instead,
there would be a nonterminating sequence of recursive calls
of the procedure assign1 (figure 3.3-9). We must therefore
translate the (assignment) b of y ← y into

    has?   y,b,error
    .call  assign1,(y,$temp)
    .call  assign1,($temp,y.b)

With this translation, the recursion terminates because we
are not updating the structure $temp during the process of
recursively going through its components.



For other cases of "overlapping" assignment, we adopt
similar translations. For example, we translate the
(assignment) y ← [ a:1; b:y ] into the BL code

    .call  assign1,(y,$temp)
    clear  y
    const  1,y.a
    .call  assign1,($temp,y.b) ;

and we translate y ← [ c:a of y ] into

    has?   y,a,error
    clear  $temp
    link   $temp,q,y.a
    clear  y
    .call  assign1,($temp.q,y.c) .

Note that in ML-1, the translator can detect any
occurrences of these "overlapping" assignments and make the
appropriate adjustments.
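
The overlap problem can be reproduced in a small Python model. This is a sketch under assumptions: Cell, assign1, show and fresh_y are illustrative names, the sample value of y is made up (it is not the structure of figure 3.3-7), and Python's recursion limit stands in for the nonterminating sequence of BL calls.

```python
class Cell:
    """Hypothetical cell: None (empty), an int, or a dict of component cells."""

    def __init__(self, value=None):
        self.value = value


def assign1(u, v):
    """Component-by-component copy of u into v (model of BL assign1)."""
    v.value = None
    if u.value is None:
        return
    if not isinstance(u.value, dict):
        v.value = u.value
        return
    v.value = {}
    for selector, component in u.value.items():
        v.value[selector] = Cell()
        assign1(component, v.value[selector])


def show(cell):
    """Render a cell's value as nested plain Python data."""
    if isinstance(cell.value, dict):
        return {s: show(c) for s, c in cell.value.items()}
    return cell.value


def fresh_y():
    return Cell({"a": Cell(1), "b": Cell(2)})


# Naive translation of  b of y <- y : the copy keeps chasing the
# components it has just created inside y.b.
y = fresh_y()
try:
    assign1(y, y.value["b"])
    naive_ok = True
except RecursionError:
    naive_ok = False

# Translation through a temporary, as in the corrected BL code.
y = fresh_y()
temp = Cell()
assign1(y, temp)
assign1(temp, y.value["b"])
result = show(y)
print(naive_ok, result)
```

The temporary works because the source of the second copy is no longer part of the structure being updated.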

ML-1 Movie 

As in the previous section, we conclude with a movie
of a sample ML-1 (program) and its translation into BL.

    ML-1                                 BL

                                         .setl  (x,y)
    x ← 4;                               const  4,x
    y ← [ a:2; b:x; c:nil ];             clear  y
                                         const  2,y.a
                                         .call  assign1,(x,y.b)
                                         clear  y.c
    x ← a of y;                          has?   y,a,error
                                         .call  assign1,(y.a,x)
    a of y ← 3;                          has?   y,a,error
                                         const  3,y.a
    x ← y;                               .call  assign1,(y,x)
    y ← [ 1:a of x;                      clear  y
          2:[ r:nil; s:4 ] ];            has?   x,a,error
                                         .call  assign1,(x.a,y.1)
                                         clear  y.2
                                         clear  y.2.r
                                         const  4,y.2.s
    s of 2 of y ← a of x;                has?   y,2,error
                                         has?   y.2,s,error
                                         has?   x,a,error
                                         .call  assign1,(x.a,y.2.s)
    c of x ← x                           has?   x,c,error
                                         .call  assign1,(x,$temp)
                                         .call  assign1,($temp,x.c)

[Movie frames: BL-object diagrams of the environment after
each (assignment) of the program above.]













3.4. Mini-Language 2 — Pointers 



Mini-Language 2 (ML-2) extends the concepts we have
developed and treats the notion of pointers (references). A
pointer is a means by which one can indirectly access a cell
and its contents. As with structures, there are two basic
operations inherent in the concept of pointers: (1) creation
of a pointer value which refers to a given cell, and
(2) accessing the cell a pointer "points" to. We wish to
provide for these operations while preserving the concepts
and mechanisms that have already been developed in this
chapter.



In ML-2, there is a new class of pointer values. As
with ML-1, cells can accommodate successive values of
different classes. We will not, however, allow indirect
references through values which are not pointers.

One respect in which the notion of pointer differs from
previous concepts is that a pointer value contains
information about the cell it refers to. Previous concepts
of value had nothing to do with cells. We shall see some of
the difficulties caused by this extension.

In this section, we treat ML-2 as an extension of ML-1.
However, it is not necessary to include structures in order
to handle the new notion of pointers. One could
alternatively omit structures from ML-2 and view it as a
direct extension to ML-0.

The "boxed" portion of the ML-2 syntax is that part of
ML-2 that deals with structured values and the basic
operations on them.



    (program)       ::= (assignment) ; ... ; (assignment)
    (assignment)    ::= (destination) ← (expression)
    (expression)    ::= (destination) | (generator) | nil
    (destination)   ::= (identifier) | (indirect)
                      | (selection)
    (indirect)      ::= val (expression)
    (selection)     ::= (selector) of (expression)
    (generator)     ::= (integer) | (pointer)
                      | (construction)
    (pointer)       ::= ptr (destination)
    (construction)  ::= [ (field) ; ... ; (field) ]
    (field)         ::= (selector) : (expression)

[Boxed portion: the alternatives | (selection) and
| (construction), and the rules for (selection),
(construction) and (field).]



Description 



There are two new syntactic classes in ML-2. A
(pointer), consisting of the symbol ptr and a (destination),
specifies the creation of a pointer value which will refer
to the same cell as the (destination). The only way to
build pointer values in ML-2 is by means of (pointer)s; we
therefore classify the (pointer) syntactically as an
instance of a (generator). An (indirect), consisting of the
symbol val and a (pointer-valued) (expression), is ML-2's
way of accessing the cell referred to by a pointer value.
As such, an (indirect) is a kind of (destination).

We have already seen all the other ML-2 syntax classes.

Semantics of ML-2 (informal)

All we need to give here are informal semantic rules
corresponding to the two new syntactic classes. All the
other semantic rules for ML-2 are identical to the
corresponding rules for ML-0 or ML-1.

(1) (pointer)s: This kind of (expression) contains a
(destination) and yields a pointer value which refers to the
same cell as this (destination).

(2) (indirect)s: An (indirect) contains an (expression).
The value yielded by the (expression) is determined. If it
isn't a pointer, the (indirect) has undefined value.
Otherwise the (indirect) specifies the cell referred to by
this pointer value.

BL Representation 

Deciding on a way to represent pointer values in BL
presents difficulties. In most systems, pointer values are
simply the numeric addresses of cells. However, in the base
language model, referencing of cells is symbolic. The most
straightforward approach to this problem is to view a cell's
pathname (i.e. sequence of selectors from the root node of
the current local structure) as its address. A pointer value
would then be represented in the base language model by an
elementary string value encoding the pathname of the cell
pointed to. Under such a scheme, after executing the ML-2
instructions

    x ← 3; y ← ptr x; z ← y; w ← val y

the environment would appear as in figure 3.4-1. After the
further instructions

    z ← x; val y ← ptr z

are executed, the environment would then appear as in figure
3.4-2. Under such a scheme, translation into BL would not be
difficult. However, this approach breaks down in the
presence of structures. For example, execution of the
sequence of ML-2 instructions

    x ← [ a:2 ]; y ← ptr a of x

would result in y having as value the pathname "x.a" (figure
3.4-3). If we then execute the (assignment) x ← 3, x would
no longer have an a-component; the cell containing the value
2 would therefore no longer have the pathname x.a and would
hence be inaccessible through y. In other words, under this
scheme there is no way to provide for retention of cells
referred to by pointers. The main conceptual weakness of
this scheme is that the address of a cell depends on a
particular path of access to it. Such a dependence is to be
avoided.
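
The breakdown of the pathname scheme can be seen in a few lines of Python. This is a sketch under assumptions: the environment is modelled as nested dicts, a pointer as a tuple of selectors, and lookup is a hypothetical helper, none of which come from the base language model itself.

```python
def lookup(env, path):
    """Follow a pathname (sequence of selectors) from the root of the
    environment; return None if the path no longer exists."""
    node = env
    for selector in path:
        if not isinstance(node, dict) or selector not in node:
            return None
        node = node[selector]
    return node


env = {}
env["x"] = {"a": 2}          # x <- [ a:2 ]
env["y"] = ("x", "a")        # y <- ptr a of x, encoded as the pathname x.a

before = lookup(env, env["y"])   # val y while x still has an a-component
env["x"] = 3                     # x <- 3
after = lookup(env, env["y"])    # the pathname x.a now leads nowhere

print(before, after)
```

The value 2 has not been overwritten by any assignment through y, yet y can no longer reach it, which is the retention failure described above.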

A second way to refer to a cell is by directly linking
to it, that is, sharing it. It is imperative that the
pointer have a separate cell for itself as well as the cell
it points to. Otherwise, after executing the ML-2
instructions x ← 3; y ← ptr x we would have a situation as
pictured in figure 3.4-4 in which the (assignment) y ← 2
would erroneously affect x (we want to access x through y
only by use of the (indirect) val y). To insure separate
cells, we will make a pointer value an instance of a
structure, where the cell pointed to will be the sole
component cell. Thus the result of executing the
instructions

    x ← [ a:2 ]; y ← ptr a of x

will be as in figure 3.4-5, and after the further
instruction x ← 3, we see that the cell containing the value
2 is properly retained (figure 3.4-6). Note that we have
adopted the reserved name "$val" as the selector for the
single component of an ML-2 pointer value under our
representation scheme (to avoid clashes with the (selector)s
of ML-2 structures).
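
A minimal Python sketch of this second representation follows. The Cell class is an assumption made for illustration; the point is only that the pointer occupies its own cell and shares the target cell through a single "$val" component.

```python
class Cell:
    """Hypothetical cell: None (empty), an int, or a dict of component cells."""

    def __init__(self, value=None):
        self.value = value


x = Cell({"a": Cell(2)})               # x <- [ a:2 ]
y = Cell({"$val": x.value["a"]})       # y <- ptr a of x  (link y,$val,x.a)

x.value = 3                            # x <- 3 detaches x's old components
retained = y.value["$val"].value       # val y still reaches the cell holding 2

y.value = 2                            # y <- 2 overwrites only y's own cell
print(retained, x.value, y.value)
```

Assigning to y itself replaces the pointer without touching x, while the earlier x ← 3 left the pointed-to cell alive because y still shared it.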

Now that we have settled on a BL representation for
pointer values, translation of ML-2 into BL is
straightforward. We only need consider four new cases of
(assignment)s:

(1) (identifier) ← (pointer)
e.g. y ← ptr x is translated into the BL code

    clear  y
    link   y,$val,x

(2) (identifier) ← (identifier)
e.g. y ← x is translated into the invocation
.call assign2,(x,y), where the definition of the BL
procedure assign2 is shown in figure 3.4-7. The difference
between assign1 and assign2 is that assign2 has additional
code to handle assignment of pointer values, preventing us
from attempting to copy the contents of a cell referred to
by some pointer. An example of the assigning of a pointer
value is depicted in figure 3.4-8.

    assign2:  .getp      (u,v)
              clear      v
              nonempty?  u,out
              elem?      u,comp
              const      *u,v
              return
    comp:     has?       u,$val,struc
              link       v,$val,u.$val
              return
    struc:    .getg      (assign2)
    loop:     getc       u,i,out
              .call      assign2,(u.*i,v.*i)
              goto       loop
    out:      return

Figure 3.4-7. Definition of the BL procedure assign2.

Fig. 3.4-8. Effect of the ML-2 (assignment) y ← x when x
has a pointer value.



(3) (identifier) ← (indirect)
e.g. z ← val y is translated into the BL code

    has?   y,$val,error
    .call  assign2,(y.$val,z)

(4) (indirect) ← (expression)
e.g. val x ← 3 is translated into the BL code

    has?   x,$val,error
    const  3,x.$val
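
The extra pointer case in assign2 can be sketched in Python as follows. As before, Cell and the dict encoding are assumptions for illustration; a pointer is taken to be any structure with a "$val" component, relying on "$val" being a reserved selector.

```python
class Cell:
    """Hypothetical cell: None (empty), an int, or a dict of component cells."""

    def __init__(self, value=None):
        self.value = value


def assign2(u, v):
    """Model of BL assign2: copy structures, but share pointer targets."""
    v.value = None                            # clear v
    if u.value is None:                       # empty source
        return
    if not isinstance(u.value, dict):         # elementary (integer) value
        v.value = u.value
        return
    if "$val" in u.value:                     # has? u,$val,struc
        v.value = {"$val": u.value["$val"]}   # link v,$val,u.$val
        return
    v.value = {}
    for selector, component in u.value.items():
        v.value[selector] = Cell()
        assign2(component, v.value[selector])


x = Cell(3)
s = Cell({"p": Cell({"$val": x}), "n": Cell(1)})   # s <- [ p:ptr x; n:1 ]
t = Cell()
assign2(s, t)                                      # t <- s

t.value["p"].value["$val"].value = 7               # val p of t <- 7
print(x.value, t.value["n"] is s.value["n"])
```

The ordinary component n is copied into a new cell, whereas the pointer component still leads to x, so the update through t is visible in x.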

Using these translation schemes, it is easy to produce
BL code corresponding to any ML-2 (program). However, the
presence of "overlapping" assignments can no longer always
be detected by the translator. For example, in the state
depicted in figure 3.4-9, we want the (assignment)
b of y ← val x to result in the state shown in figure
3.4-10. The BL code

    has?   y,b,error
    has?   x,$val,error
    .call  assign2,(x.$val,$temp)
    .call  assign2,($temp,y.b)

works properly. In other words, the translator must produce
BL code to perform extra copying whenever there is a
possibility of overlap. This is a source of inefficiency,
since overlap is probably an infrequent event.





ML-2 Movie

    ML-2                                 BL

                                         .setl  (x,y,z)
    x ← [ a:4; b:nil ];                  clear  x
                                         const  4,x.a
                                         clear  x.b
    y ← ptr b of x;                      has?   x,b,error
                                         clear  y
                                         link   y,$val,x.b
    val y ← 5;                           has?   y,$val,error
                                         const  5,y.$val
    z ← [ c:y; d:val y; e:ptr z ];       has?   y,$val,error
                                         .call  assign2,(y.$val,$temp)
                                         .call  assign2,(y,z.c)
                                         .call  assign2,($temp,z.d)
                                         link   z.e,$val,z
    b of x ← 6;                          has?   x,b,error
                                         const  6,x.b
    x ← z                                .call  assign2,(z,x)

[Movie frames: BL-object diagrams of the environment after
the prologue and after each (assignment) of the program
above.]




3.5. Mini-Language 3 — Sharing 



So far in this chapter, we have progressed through 
three mini-languages in developing our semantic model for 
data structures and pointers. Although ML-2 handles all of 
these concepts, there are some respects in which the design 
we so carefully built up becomes cumbersome and inelegant. 
In this section we shall look at some of the weaknesses of 
ML-2 and see how they reflect a conceptual shortcoming in
our design. The mini-language ML-3 is devised to remedy
these deficiencies. By revising the notion of structures, 
ML-3 becomes not only more powerful and efficient than ML-2, 
but conceptually simpler as well. In fact, the entire ap- 
paratus of pointers that was developed in the previous sec- 
tion is subsumed within the re-definition of structured 
value. 

The main difficulty with ML-2 emerges when we consider 
the way pointer values are represented in the base language 
model. This is admittedly a rather strange way to examine 
the merits of a language, namely in terms of a representa- 
tion decision with respect to a particular semantic model. 
But the base language model is special in that it was spe- 
cifically designed for the purpose of describing the con- 
cepts of sharing which we are studying. So it is perfectly 
valid to use insights provided by this model to aid in de- 
signing mini-languages which deal with data structures and 
sharing. 

In the last section, we chose to represent a pointer
value in the base language model as a one-component
structure whose component cell is precisely the cell pointed
to. In other words, pointer values are instances of
structures whose components share with other data objects.
It is this much more general concept of shared data objects
that concerns us in this section. The only kind of sharing
provided in ML-2 is the pointer, which is a structure having
exactly one component cell, shared with some object. In the
course of trying to model aspects of real-world programming
languages in ML-2, this limitation becomes a stumbling
block. For example, the notion of "tuple" in languages like
BASEL is that of a vector of addresses, i.e. a structure
with an arbitrary number of components sharing with other
objects. In ML-2, this can be modeled only as a structure
whose components are pointers. These components, when
represented in the base language model, take up an extra
level of indirection, which becomes a bit clumsy.

To give a better treatment to this generalized notion
of sharing, we revise our concept of structure. In ML-2, as
in ML-1, the notion of structured values as being composed
of components with (selector)s and values does not directly
utilize the concept of cells. Cells are part of only
pointer values. What we've done in ML-2 is represent
pointers like structures but use a different set of rules to
manipulate them. This conceptual distinction puts the two
notions (structured values and pointer values) almost at
odds with each other in ML-2. We include cells in our
revised concept of structured values in ML-3; as a result of
this, the need for a separate class of pointer values
vanishes.

A structured value in ML-1 and in ML-2 was a collection 
of components, each consisting of a value and an associated 
(selector). In ML-3, we define a component of a structure 
to now be a (selector) -cell pair, rather than a ( selector )- 
value pair. The value of a structured object is still the 
set of its components. 



    (program)       ::= (assignment) ; ... ; (assignment)
    (assignment)    ::= (destination) ← (expr)
    (expr)          ::= (destination) | (generator)
                      | (modification) | nil
    (destination)   ::= (identifier) | (selection)
    (selection)     ::= (selector) of (expr)
    (generator)     ::= (integer) | (construction)
    (construction)  ::= [ (field) ; ... ; (field) ]
    (field)         ::= (selector) : (cell expr)
    (cell expr)     ::= share (destination) | (expr)
    (modification)  ::= (construction) (expr)

Description

The syntactic classes of ML-3 are identical to those of
ML-1, with two additions. First, there are now two kinds of
expressions in ML-3: an (expr) yields a value, and a
(cell expr) yields a cell. The only occurrence of
(cell expr)s is within the (field)s of a (construction)
(where there used to be (expr)s in ML-1 and ML-2). The
rules for evaluating both kinds of expressions are given
below. The second addition is a new kind of (expr), namely
the (modification), which yields structured objects built
from other structures. All other syntactic classes are
exactly as they were in ML-1.

Semantics of ML-3 (informal)

The semantic rules for (program)s, (assignment)s,
(destination)s, (identifier)s and (selection)s are identical
to the rules given for ML-1. The remaining elements warrant
some discussion.

(1) (expr)s: The occurrence of nil or of a
(destination) as an (expr) is handled just as in ML-0 and
ML-1. (generator)s are either (integer)s, which are handled
as before, or (construction)s, which are described in
rule (2) below. (modification)s are discussed in rule (6)
below.

(2) (construction)s: The semantics of (construction)s
and (field)s follows directly from the new ML-3 notion of
structures. A (construction) denotes the value of a
structure which is generated on the spot. A (construction)
consists of a series of (field)s, each with a (selector) and
a (cell expr). Each (field) represents a component
consisting of this (selector) and the cell yielded by the
(cell expr) (see rule (3) below). Finally, the structured
value yielded by the (construction) is the set of components
given by its (field)s. We make one restriction on
(construction)s: the (selector)s of its (field)s must be
distinct, or else the (construction) is invalid and has
undefined effect.

(3) (cell expr)s: The two kinds of (cell expr) are
discussed in rules (4) and (5) below.

(4) shared (destination)s: A (cell expr) of the form
share (destination) yields the cell referred to by the
(destination). This is the basic source of sharing in ML-3;
shared (destination)s are used to build structures having
components whose cells are already in use. It is this
facility which subsumes the ML-2 notion of pointers.

(5) (expr)s as (cell expr)s: The cell yielded by an
(expr) occurring as a (cell expr) is a newly-allocated cell
distinct from all cells in use and containing the value
yielded by the (expr). Evaluation of a (cell expr) of form
(expr) is the only way to allocate new cells in ML-3.

(6) (modification)s: A (modification) consists of a
(construction) and an (expr). The value of the (expr)
(which we call the modificand) must be a structure or the
indicator nil, or else the effect of the (modification) is
undefined. The value yielded by the (modification) will be
a newly-generated structure whose components are obtained as
follows:

    (i)  Each component of the modificand whose
         (selector) belongs to no (field) of the
         (construction) will be a component of the
         new structure.

    (ii) For each (field) of the (construction) there
         will be in the new structure a component with
         the same (selector) and as its cell the cell
         yielded by the (cell expr) of the (field).

Alternatively, we can view each (field) of the
(construction) as either replacing or appending a component
to the modificand depending on whether or not its (selector)
belongs to some component of the modificand. Note that
evaluation of a (modification) may cause allocation of new
cells, but it does not in any way affect the contents of
existing cells.

Strictly speaking, (modification)s are redundant in ML-3.
If, for example, the (identifier) x has a structured value
with two components whose (selector)s are a and b, then the
(modification) [b:3; c: share y] x will yield the same value
as the (construction) [a: share a of x; b:3; c: share y].
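
Rule (6) can be sketched in Python under the ML-3 view that a structure is a set of (selector, cell) pairs. The names Cell and modify are assumptions made for this illustration, the fields argument is taken to hold cells already yielded by the (cell expr)s, and raising TypeError is just one way to treat the undefined case.

```python
class Cell:
    """Hypothetical cell holding None, an int, or a dict selector -> Cell."""

    def __init__(self, value=None):
        self.value = value


def modify(fields, modificand):
    """Return the structure yielded by a (modification).

    fields:     dict selector -> Cell, one entry per (field)
    modificand: dict selector -> Cell, or None for nil
    """
    if modificand is not None and not isinstance(modificand, dict):
        raise TypeError("modificand must be a structure or nil")
    new_structure = dict(modificand or {})   # (i) keep untouched components
    new_structure.update(fields)             # (ii) replace or append fields
    return new_structure


cell_a, cell_b, y = Cell(1), Cell(2), Cell(9)
x_value = {"a": cell_a, "b": cell_b}

# [ b:3; c: share y ] x
result = modify({"b": Cell(3), "c": y}, x_value)
print(sorted(result), result["b"].value, x_value["b"].value)
```

The component a of the result is the very cell of x, the component c is the cell of y, and the modificand itself is left unchanged because only the new dict is updated.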
BL Representation

We represent a structured value in ML-3 by a BL-object
whose arcs lead into the nodes for the component cells, and
are labeled with the corresponding (selector)s. This is
straightforward, simple and clean.

We now treat the translation into BL of ML-3 (assignment)s.

    (1) (identifier) ← nil
and (2) (identifier) ← (integer)

are both handled as in ML-0 and ML-1.

(3) (identifier) ← (identifier)
e.g. y ← x is translated into a call on the BL procedure
assign3 (figure 3.5-1). Its code is the same as for the
procedure assign1 for empty and integer values of the source
(identifier) x, except for the presence of the
same? x,y,out test which makes sure the (assignment) is
nontrivial (otherwise the clear instruction would destroy
the value we want to keep). If x has a structured value,
then y will get the same structured value. This means, by
the new definition of structured value, that the components
of y will now share with the components of x (figure 3.5-2).
In executing any (assignment), the contents of exactly one
cell will be copied. Component cells are now shared, not
copied. Note that this is a vast gain in efficiency for ML-3
over ML-1 and ML-2. The "meaning" of the (assignment) y ← x,
then, differs between ML-1 and ML-3. For example, after
executing the instructions

    x ← [ a:3; b:4 ]; y ← x; a of y ← 5,

the expression a of x will yield the value 3 in ML-1 (and
ML-2), but will evaluate to 5 in ML-3.

    assign3:  .getp      (u,v)
              same?      u,v,out
              clear      v
              nonempty?  u,out
              elem?      u,struc
              const      *u,v
              return
    struc:    getc       u,i,out
              link       v,*i,u.*i
              goto       struc
    out:      return

Fig. 3.5-1. Definition of the BL procedure assign3

Fig. 3.5-2. Effect of the ML-3 (assignment) y ← x when x
has a structured value

(4) (identifier) ← (selection)
e.g. y ← b of x is translated into the BL code

    has?   x,b,error
    .call  assign3,(x.b,y)

(5) (selection) ← (identifier)
e.g. a of y ← x is translated into the BL code

    has?   y,a,error
    .call  assign3,(x,y.a)

(6) (identifier) ← (construction)
e.g. y ← [ c:x; d:b of x; e: share z ] is translated into

    has?   x,b,error
    .call  assign3,(x.b,$temp)
    clear  y
    .call  assign3,(x,y.c)
    .call  assign3,($temp,y.d)
    link   y,e,z

Fig. 3.5-3. Effect of y ← [ c:x; d:b of x; e: share z ]
in ML-3



Note that overlapping (assignment)s pose no problem at all
for statements of types (4) and (5). This is due to the
fact that component cells of a structure are no longer
copied on assignment. However, we do need the use of
temporaries in (assignment)s involving (construction)s, for
instance, to take care of the case when y shares with
b of x before executing the (assignment) in example (6)
above.
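
The difference in the "meaning" of y ← x between ML-1 and ML-3 can be checked with a small Python model. This is a sketch under assumptions: Cell, assign1, assign3 and run are illustrative names, and structures are dicts from selector to cell.

```python
class Cell:
    """Hypothetical cell: None (empty), an int, or a dict of component cells."""

    def __init__(self, value=None):
        self.value = value


def assign1(u, v):
    """ML-1 style: copy component by component into new cells."""
    v.value = None
    if u.value is None:
        return
    if not isinstance(u.value, dict):
        v.value = u.value
        return
    v.value = {}
    for selector, component in u.value.items():
        v.value[selector] = Cell()
        assign1(component, v.value[selector])


def assign3(u, v):
    """ML-3 style: copy one cell's contents; component cells are shared."""
    if u is v:                            # same? u,v,out
        return
    v.value = None                        # clear v
    if u.value is None:
        return
    if not isinstance(u.value, dict):
        v.value = u.value                 # const *u,v
        return
    v.value = dict(u.value)               # link v,*i,u.*i for each component


def run(assign):
    x, y = Cell(), Cell()
    x.value = {"a": Cell(3), "b": Cell(4)}   # x <- [ a:3; b:4 ]
    assign(x, y)                             # y <- x
    y.value["a"].value = 5                   # a of y <- 5
    return x.value["a"].value                # a of x


print(run(assign1), run(assign3))
```

Under the copying rule the update stays local to y, while under the sharing rule x and y hold the same component cells, so the update is seen through both.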

Finally, we note that the pointers of ML-2 have been
subsumed in ML-3. In place of the ML-2 ptr (destination)
we can write the ML-3 (construction)
[val: share (destination)], and wherever ML-2 uses
val (expr), ML-3 substitutes val of (expr).

ML-3 Movie 



    ML-3                                 BL

                                         .setl  (x,y,z)
    x ← [ c:3; d:nil ];                  clear  x
                                         const  3,x.c
                                         clear  x.d
    z ← [ a:4; b:[ q:c of x;             has?   x,c,error
                   r:nil ] ];            .call  assign3,(x.c,$temp)
                                         clear  z
                                         const  4,z.a
                                         clear  z.b
                                         .call  assign3,($temp,z.b.q)
                                         clear  z.b.r
    y ← [ p: share x ];                  clear  y
                                         link   y,p,x
    p of y ← y;                          has?   y,p,error
                                         .call  assign3,(y,y.p)
    y ← b of z;                          has?   z,b,error
                                         .call  assign3,(z.b,y)
    x ← [ b:5 ] z;                       .call  assign3,(z,x)
                                         const  5,x.b
    z ← [ c: share q of y ] z;           has?   y,q,error
                                         link   z,c,y.q
    y ← [ a:b of z; c: share z ] x;      has?   z,b,error
                                         .call  assign3,(z.b,$temp)
                                         .call  assign3,(x,y)
                                         .call  assign3,($temp,y.a)
                                         link   y,c,z

[Movie frames: BL-object diagrams of the environment after
each (assignment) of the program above.]



3.6. Discussion and Examples 



In this chapter we have built up a hierarchy of
mini-languages, culminating in ML-3. We now relate this
development to the main issues that were raised in
Chapter 1. A major concern with respect to a given
"real-world" programming language is the effect of its
assignment operation on an environment containing structured
data objects. We know that executing an assignment statement
of the form X := e will result in the identifier X having
the value associated with the expression e. What is
uncertain is the effect of such an assignment upon the
sharing relationships among the various cells in the
environment. Variations in sharing properties can in general
induce differences in the effect of subsequent assignments.

We give an example adapted from [Bur 68]. The only
data structures in the environment will be LISP-like lists
with two components selected by the respective selectors
head and tail. Burstall compares analogous programs in two
languages: List-Algol, which combines ALGOL 60 assignment
with structures essentially equivalent to LISP lists, and
ISWIM ("If you See What I Mean"), which is based on the same
functional lambda-calculus notions as LISP. In both languages,
the two-argument function cons returns a list whose
head is the first argument and whose tail is the second
argument; the functions head and tail select the components from
a list. Burstall's two programs are shown in figure 3.6-1.
Program A, we are told, prints 3 while program B prints 1
"since it does not cater for the side-effect on y of the
assignment to x." This explanation gives little insight
into why there should be such a difference in the first
place. The obvious distinction between the two programs



    Program A: List-Algol             Program B: ISWIM

    begin list x,y;                   print let x = undef and y = undef;
    x := CONS(1, nil);                let x = cons(1, nil);
    y := CONS(2, x);                  let y = cons(2, x);
    HEAD(x) := 3;                     let x = cons(3, tail(x));
    print(head(tail(y)))              result head(tail(y))
    end

Fig. 3.6-1. Two sample programs with different effects.



lies in line 4. ISWIM, being a functional applicative language,
has no direct counterpart to the List-Algol component
update statement HEAD(x) := 3. But this is not the root of
the semantic difference between the two programs. Burstall
neglects to say that even if we change line 4 in Program A
to x := CONS(3, TAIL(x)), Program A will still print 3.

The source of the trouble lies in a subtle difference
between the cons functions in the two languages. We can
pinpoint the distinction by translating both programs into
ML-3. Line 2 in both programs can be translated into
x ← [ head:1; tail: nil ], with the resulting environment as
in figure 3.6-2. Line 3 in Program A is equivalent to the
ML-3 statement y ← [ head:2; tail: share x ], while line 3
in Program B is equivalent to y ← [ head:2; tail: x ]. The
respective results are shown in figures 3.6-3 and 3.6-4.

[Diagrams not reproduced.]

Fig. 3.6-2. State after line 2.
Fig. 3.6-3. After line 3, Program A.
Fig. 3.6-4. After line 3, Program B.



Finally, the revised line 4 for Program A, which reads
x := CONS(3, TAIL(x)), is equivalent to the ML-3 statement
x ← [ head:3; tail: share tail of x ], while line 4 of Program B
is equivalent to x ← [ head:3; tail: tail of x ].
The respective results are shown in figures 3.6-5 and 3.6-6.

[Diagrams not reproduced.]

Fig. 3.6-5. After new line 4, Program A.
Fig. 3.6-6. After line 4, Program B.

We can see that the ML-3 expression head of tail of y
yields 3 in figure 3.6-5 and 1 in figure 3.6-6.
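
The two translations can be replayed in a small simulation. The Python sketch below is only an illustration under assumptions of ours, not part of the BL model: the class Cell and the functions program_a and program_b are hypothetical names, a structured value is modeled as a dict from selectors to component cells, and share is modeled by reusing an existing Cell object instead of allocating a new one.

```python
class Cell:
    """A location; contents is an int, None (nil), or a dict selector -> Cell."""

    def __init__(self, contents=None):
        self.contents = contents


def program_a():
    """List-Algol reading: CONS uses the location of an identifier argument."""
    x, y = Cell(), Cell()
    # x <- [ head:1; tail: nil ]
    x.contents = {"head": Cell(1), "tail": Cell(None)}
    # y <- [ head:2; tail: share x ]  (the tail component IS the cell of x)
    y.contents = {"head": Cell(2), "tail": x}
    # x <- [ head:3; tail: share tail of x ]  (the revised line 4)
    x.contents = {"head": Cell(3), "tail": x.contents["tail"]}
    # head of tail of y
    return y.contents["tail"].contents["head"].contents


def program_b():
    """ISWIM reading: cons uses the value of an identifier argument."""
    x, y = Cell(), Cell()
    # x <- [ head:1; tail: nil ]
    x.contents = {"head": Cell(1), "tail": Cell(None)}
    # y <- [ head:2; tail: x ]  (a new cell receives the value of x)
    y.contents = {"head": Cell(2), "tail": Cell(x.contents)}
    # x <- [ head:3; tail: tail of x ]
    x.contents = {"head": Cell(3), "tail": Cell(x.contents["tail"].contents)}
    # head of tail of y
    return y.contents["tail"].contents["head"].contents


print(program_a(), program_b())
```

Under these assumptions program_a() returns 3 and program_b() returns 1, in agreement with figures 3.6-5 and 3.6-6: only in Program A does the tail component of y refer to the very cell that the later assignment to x overwrites.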

The difference between the two cons functions in Burstall's
two languages should now be clear. If an argument
to cons is a constant or nil, both languages specify allocation
of a new cell to contain the argument value. But if
an argument is some identifier, the List-Algol CONS yields
for the corresponding component the argument's location,
while the ISWIM cons yields the argument's value. This
property of the ISWIM cons function is not explicitly stated
in Landin's descriptions of ISWIM [Lan 64, Lan 65, Lan 66a].
In fact, the only place from which this property could be
readily ascertained was in Burstall's statement that Program
B prints the value 1. The ML-3 code into which we translated
the statements of the two programs was determined only
from the stated results of those programs. What is to be
concluded from this is not that Landin was sloppy or vague
in his language design and definition, but rather that the
language definition methods which are so widely used make it
extremely difficult to extract some of the properties of
significant practical importance. In other words, a language
which features data structures will be better understood
and better specified if it defines these facilities in
some manner which makes clear the specific sharing relationships
among locations.

In the remainder of this section we shall use our mini- 
languages to talk about the data structuring facilities and 
mechanisms of several additional programming languages. 

PAL 

The language PAL [Ev 70] supports only one kind of data 
structure: the tuple . A tuple is a structure whose selec- 
tors are consecutive integers starting with 1. As with 
ML-3, the cell in which a component of a tuple is stored is 
considered an integral part of the value of the tuple. The 
PAL expression 4,5,6 specifies the construction of a tuple
whose components have the respective values 4, 5, and 6; as
such, it is equivalent to the ML-3 (construction)
[ 1:4; 2:5; 3:6 ]. Selection in PAL is expressed by
juxtaposition; if the tuple value 4,5,6 is assigned to the
variable x, then the PAL expression x 2 evaluates to 5 (it
selects the second component) . This expression corresponds 
to the ML-3 (selection) 2 of x. The correspondences we 
have established are summarized in figure 3.6-7. 



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The concepts of value of a tuple in PAL and value of a 
structure in ML-3 are very close, and we might expect simi- 
lar assignments to behave similarly. This is indeed the 
case, as figure 3.6-8 confirms. 




    PAL                     ML-3

    x := 4,5,6;             x ← [1:4; 2:5; 3:6];
    y := x 2                y ← 2 of x

Fig. 3.6-7. Construction and selection in PAL.

    PAL                     ML-3

    x := 7,8;               x ← [1:7; 2:8];
    y := x                  y ← x

Fig. 3.6-8. Value of a tuple in PAL.



PAL has a semantic rule that components of a tuple 
share with the items in the list expression that constructs 
it; an example of this rule is shown in figure 3.6-9. This 
sharing can be blocked using the PAL unshare operator ("$").
Figure 3.6-10 gives an example of this.









    PAL                     ML-3

    x := 5,6;               x ← [1:5; 2:6];
    y := x,7                y ← [1: share x; 2:7]

Fig. 3.6-9. Sharing in PAL tuple construction.

    PAL                     ML-3

    x := 5,6;               x ← [1:5; 2:6];
    y := $x,7               y ← [1:x; 2:7]

Fig. 3.6-10. Blocking of sharing in PAL.
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We discuss one more feature of PAL: the aug function. 
If t is an n-tuple (i.e. a tuple with selectors 1,2,...,n) and
e is any expression, then the PAL expression t aug e
denotes an (n+1)-tuple whose first n components share with
the components of t, and whose (n+1)-st component shares
with e. Examples are shown in figures 3.6-11 and 3.6-12.



    PAL                     ML-3

    x := nil aug 3;         x ← [1:3] nil;
    y := nil aug x          y ← [1: share x] nil

Fig. 3.6-11. Example of the use of the PAL function aug.

    PAL                     ML-3

    x := (7,8),9;           x ← [1: [1:7; 2:8]; 2:9];
    z := 5,6;               z ← [1:5; 2:6];
    y := x aug z            y ← [3: share z] x

Fig. 3.6-12. Another example of aug in PAL.
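
The PAL rules for tuple construction, "$" and aug can be imitated in a few lines. The following Python sketch rests on our own assumptions: Cell, as_cell, pal_tuple, unshare and aug are hypothetical helper names, a tuple value is a dict from the selectors 1..n to component cells, an identifier is represented by its Cell (and is therefore shared), and nil is represented by None.

```python
class Cell:
    """A location; contents is an int, None (nil), or a dict selector -> Cell."""

    def __init__(self, contents=None):
        self.contents = contents


def as_cell(item):
    """An identifier (a Cell) is shared; any other item gets a fresh cell."""
    return item if isinstance(item, Cell) else Cell(item)


def pal_tuple(*items):
    """Tuple value: the selectors 1..n mapped to component cells."""
    return {i: as_cell(item) for i, item in enumerate(items, start=1)}


def unshare(cell):
    """PAL '$': a fresh cell with the same value, so the cell is not shared."""
    return Cell(cell.contents)


def aug(t, e):
    """t aug e: keep the component cells of t and append the cell of e."""
    components = dict(t) if t else {}
    components[len(components) + 1] = as_cell(e)
    return components


x = Cell(pal_tuple(5, 6))
y = Cell(pal_tuple(x, 7))            # y := x,7    (as in fig. 3.6-9)
w = Cell(pal_tuple(unshare(x), 7))   # w := $x,7   (as in fig. 3.6-10)
z = Cell(pal_tuple(5, 6))
v = Cell(aug(x.contents, z))         # v := x aug z

print(y.contents[1] is x)            # component 1 of y is the cell of x
print(w.contents[1] is x)            # "$" blocked the sharing
print(v.contents[3] is z)            # the new last component shares with z
```

In this model a later assignment to x is visible through component 1 of y but not through component 1 of w, which is the distinction drawn between figures 3.6-9 and 3.6-10.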



The above features illustrate nearly all of PAL's data
structuring capabilities, and they are easily expressed in ML-3.
Even though the data-structure facilities of PAL bear a
strong resemblance to ML-3, we have given a demonstration of
a full-scale, real-world programming language whose data
structuring mechanisms have been successfully treated within
our model. We discuss two more languages.

QUEST

The language QUEST [Fenn 73] provides data structures
called lists that appear very much like PAL's tuples (see
figure 3.6-13). However, the definition of assignment in
QUEST treats lists as special cases for which special rules
apply. This reduces, essentially, to a treatment of lists in
the way ML-1 treats structures: component values are copied on
assignment rather than shared. Figure 3.6-14 presents an
example. Note that componentwise copying is coded in ML-3
by repeated component updates, reflecting a lack of efficiency.
QUEST assignments, unlike their counterparts in PAL,
cannot be directly translated into ML-3 without knowing runtime
values (i.e. exactly what components a structured value
possesses at any given time, so that they can be individually
updated).

    QUEST              PAL                ML-3

    x ← 3,4;           x := 3,4;          x ← [1:3; 2:4];
    y ← x(2)           y := x 2           y ← 2 of x

Fig. 3.6-13. Lists in QUEST.

    QUEST          ML-1                  ML-3

    x ← 6,7;       x ← [1:6; 2:7];       x ← [1:6; 2:7];
    y ← x;         y ← x;                y ← [1: nil; 2: nil];
                                         1 of y ← 1 of x;
                                         2 of y ← 2 of x;
    z ← 5,x        z ← [1:5; 2:x]        z ← [1:5;
                                              2: [1: nil; 2: nil]];
                                         1 of 2 of z ← 1 of x;
                                         2 of 2 of z ← 2 of x

Fig. 3.6-14. Copying of components in QUEST assignment.
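
The contrast between the two assignment disciplines can be made concrete. The Python sketch below is an illustration under our own assumptions, not a definition of QUEST: Cell, copy_value, quest_assign and ml3_assign are hypothetical names, and a structured value is modeled as a dict from selectors to component cells. Note that copy_value must enumerate the components actually present at runtime, which is exactly the knowledge the text says a static translation into ML-3 lacks.

```python
class Cell:
    """A location; contents is an int, None (nil), or a dict selector -> Cell."""

    def __init__(self, contents=None):
        self.contents = contents


def copy_value(value):
    """Copy a value component by component, allocating fresh cells throughout."""
    if isinstance(value, dict):
        return {sel: Cell(copy_value(c.contents)) for sel, c in value.items()}
    return value


def quest_assign(dest, src):
    """QUEST/ML-1 style: the destination receives its own component cells."""
    dest.contents = copy_value(src.contents)


def ml3_assign(dest, src):
    """ML-3 style: component cells belong to the value and are not copied."""
    value = src.contents
    dest.contents = dict(value) if isinstance(value, dict) else value


x = Cell({1: Cell(6), 2: Cell(7)})   # x <- 6,7
y, w = Cell(), Cell()
quest_assign(y, x)                   # components copied
ml3_assign(w, x)                     # components shared
x.contents[1].contents = 0           # 1 of x <- 0

print(y.contents[1].contents, w.contents[1].contents)
```

After the component update of x, the copy held by y still has 6 in its first component, while w, which shares the component cells of x, sees 0.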

Like ML-2, QUEST handles sharing entirely by means of
pointers (called references). Their use is illustrated in
figure 3.6-15. There is no appreciable difference between
the behavior of these pointers and those in ML-2.
Translation into ML-3 would be trivially easy.

    QUEST              ML-2

    x ← 3;             x ← 3;
    y ← ref x;         y ← ptr x;
    z ← val y          z ← val y

Fig. 3.6-15. References in QUEST.

For the interested reader, the paper on QUEST [Fenn 73]
specifies a way to express general ML-3-like structures in
QUEST using lists and references. QUEST functions cons, car
and cdr are defined, and it is claimed that they simulate
their LISP counterparts. The simulation requires an extra
level of indirection throughout, a major inefficiency (fig.
3.6-16). Thus we see that using our mini-languages, we have
not only been able to illustrate the data structuring semantics
of QUEST, but we have also perceived a shortcoming in the
design of QUEST: like ML-2, QUEST fails to recognize the
fundamental significance of the concept of sharing.

    QUEST                   ML-2

    x ← cons(4, nil);       temp1 ← nil;
    y ← cons(5,x)           temp2 ← [1:4; 2: ptr temp1];
                            x ← ptr temp2;
                            temp3 ← [1:5; 2: ptr val x];
                            ...

    ML-3

    x ← [val: [1:4; 2: [val: nil]]];
    y ← [val: [1:5; ...

Fig. 3.6-16. QUEST simulation of LISP cons.

SNOBOL4 

In the language SNOBOL4 [Gris 71], one finds data
structures called "programmer-defined data types." An invocation
of the function DATA causes selector and constructor
functions to be defined. For example, the invocation
DATA('COMPLEX(R,I)') defines the constructor function
COMPLEX and the associated selector functions R and I,
setting up the correspondence depicted in figure 3.6-17.
Beyond this aspect, in which these SNOBOL structures behave
exactly as do all the structures we have seen in other
languages, the sharing relationships need to be considered.

[Figure not reproduced. Legible fragments: SNOBOL "R(C) = 3";
ML-3 "c ← [ r:1; i:2 ];".]

Fig. 3.6-17. Structures in SNOBOL.

But semantic rules which shed light on these properties
are absent; instead, there are a few
examples. As with ISWIM, careful examination of the examples
is required to realize a consistent and unambiguous
ML-3 representation for the data structuring facilities of
SNOBOL4. Some detective work is needed here as well: each
of the two books [Gris 71, …] provides insufficient
information to make such a determination, but using both
together, enough clues can be gathered to resolve possible
ambiguities. An example is shown in figure 3.6-18.

The translation into ML-3 may be straightforward, but a
number of other possible translations which would result in
different sharing properties were ruled out only after
painstaking examination of the examples in both books.
Surely a discussion of sharing in these books could have
shed much-needed light on the semantics of data structures
in SNOBOL4.



    SNOBOL                       ML-3

    DATA('NODE(VALUE,LINK)')
    P = NODE(5,)                 p ← [ value:5; link: nil ];
    Q = NODE(6,)                 q ← [ value:6; link: nil ];
    LINK(Q) = P                  link of q ← p

Fig. 3.6-18. Sharing properties in SNOBOL.



Completeness

In this chapter, we defined a series of mini-languages
and used them to model data structuring facilities in three
representative programming languages. An important question
to ask is how "complete" our treatment is. In other words, how
thoroughly have we covered the approaches to data structures
found in these three languages? At first glance, our treatment
seems rather incomplete because of the limited expressive
power of the mini-languages. However, most of the
features not included in our mini-languages are independent
of the notion of data structures; that is, the way
such features are defined in an actual programming language
has no bearing on how the language approaches concepts of
data structures. The fact that our mini-languages lack
character strings and conditional expressions, for instance,
does not reflect on their completeness for describing data
structures.

In PAL, there are only two notions we have not covered
which have a direct bearing on data structures. First,
arbitrary integer-valued expressions can be used to select
components from a tuple. For example, the selection x n
refers to the component of the tuple x whose selector is the
value of the variable n. This cannot be translated into our
mini-languages, which allow only constant (selector)s (the
ML-3 (selection) n of x would look for a component with
selector "n"). The second uncovered feature in PAL is the
built-in function Order, which when applied to a tuple
yields the number of components in the tuple.

Neither of these two notions can be expressed in our
mini-languages, but it was not our goal to be able to do
so. For these two data structuring features, the semantic
issues are well understood; we don't really need to treat
them in our mini-languages. Extending the mini-languages to
handle extra notions like these would only serve to ruin the
syntactic and semantic simplicity of the mini-language
approach.

In QUEST, the only data-structuring features we did not 
treat are the use of expressions to select components from a 
list, and several built-in functions that operate on lists.
As with PAL, we feel that the issues raised here are outside
the area of our main concern. 

With SNOBOL4, we completely neglected the area of
arrays. Although arrays are highly relevant to the issues
we are interested in, they present some difficult problems
for whose solutions additional mechanisms are needed. We
discuss some of these problems in chapter 5.

The three languages covered in this section are all
"typeless" languages in the sense that there are no
declarations associating identifiers with particular data
types. In the next chapter, we deal with "typed" languages
and some new semantic issues they introduce.
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Chapter 4 
DATA TYPES AND TYPECHECKING

4.1. Why We Want a Type System

In this chapter we will add a new facet to the design
of our previous mini-languages. Consider the ML-3
(assignment) y ← x, which directs that the contents of the
cell for x be copied into the cell for y. We translated
this (assignment) into an invocation of the procedure
assign3 (defined in fig. 3.5-11). Every time this
procedure is called, there is a sequence of tests
performed to check whether the cell for the first parameter
(which corresponds to x) contains an integer or a structure.
The set of BL instructions chosen to perform the assignment
operation depends on the result of these tests. In practice,
however, a programmer will usually know in advance
whether the identifier x will take on integer or structured
values. This knowledge makes these runtime type tests in
assign3 superfluous. We would like some way of telling the
translator not to make such tests where they are not needed.

The technique of static typechecking achieves these
goals. Its basic idea is to partition the set of values
into convenient subsets called types. The translator can be
informed of the programmer's intentions of keeping values 
only of a certain type in some given cell. With this know- 
ledge, redundant runtime type tests can be eliminated. But 
it is still necessary to prevent type errors. For example, 
suppose we tell the translator that the variable x will take 
on only structured values. Each time we access the value of 
x, the BL code produced by the translator will fetch the 
components of x. If we somehow place an integer value in 
the cell bound to x, then during execution the interpreter 
would attempt to extract components where there are none, 
yielding undefined, probably erroneous results. To prevent 
such type errors from occurring, we would like to have the 
translator test each (assignment) to make sure it couldn't 
specify the placing of a value of one type into a cell
intended to hold values of another type. Any (program)
containing (assignment)s which fail this test is invalid; the
translator will notify the user of such an error in the same
way that it flags syntactically erroneous (program)s.

In testing (assignment)s for validity, it will be useful
for the translator to know for each (destination) the
type of values intended to be stored in the associated cell.
This criterion can help us decide how to partition the ML-3
values into types. If we divide values into just two types,
integers and structures, then the above criterion is not
always satisfied. Suppose the (identifier) x is specified as
assuming only structured values. Then the values yielded by
both of the (expression)s [ a:3; b:4 ] and
[ a:3; b: [ c:5; d:6 ] ] can be stored in the cell bound to
x, but we cannot say anything about the type of the 
(destination) b of x. In one case it has an integer value; 
in the other case, a structure. Thus finer type 
classifications are called for. We will want to ascertain 
from the type of a structured value what components it has 
and the type of each component. Such a type system is the 
basis for our next mini-language. 

4.2. Mini-Language 4 -- Static Typechecking

Mini-Language 4 (ML-4) adds the notions of data types
and static typechecking to the concepts we developed in the
previous chapter. Specifically, it is an extension to ML-3,
associating to every (expression) and to every cell a
particular data type. For our purposes, we consider data types
as sets of values. The set of integers is an ML-4 data
type. Further, the set of all structured values with a
given set of component (selector)s such that the type of the
component associated to each specific (selector) is given
also is an ML-4 type. With this collection of data types,
if we associate a type to each (identifier) mentioned in a
(program), then we shall be able to determine the type
associated with each cell referred to in the (program).
Moreover, for any particular data type, one can determine whether
the value yielded by a given (expression) belongs to this type.

Syntax of ML-4 

The rules here govern the syntax of that part of ML-4
which is not found in ML-3 (namely the type system). We
introduce the new primitive syntactic class (typename) to
denote the set of underlined alphanumeric strings beginning
with a letter. The distinguished (typename) int has particular
significance, which will be discussed below.

    (program)    := (prelude) ; (assignment) ; ... ; (assignment)
    (prelude)    := (defn) ; ... ; (defn) ; (decl) ; ... ; (decl)
    (defn)       := (typename) = (structype)
    (structype)  := [ (comp decl) ; ... ; (comp decl) ]
    (comp decl)  := (typename) (selector)
    (decl)       := (typename) (identifier) , ... , (identifier)

The remainder of the ML-4 syntax is identical to the syntax
presented for ML-3, with two exceptions. First, ML-4 has no
(modification)s (which we simply have no occasion to make
use of), and second, (construction)s appear slightly different:

    (construction) := (typename) [ (field) ; ... ; (field) ]
    (field)        := (cell expr)

(The (selector)s that no longer explicitly appear in the
(field)s of a (construction) may be found in the (defn) for
the (typename) of the (construction).)

Description 

We need to interpret the new syntactic classes. A
(program) in ML-4 is essentially a (program) in ML-3,
preceded by a (prelude). The (prelude) is a sequence of type
definitions ((defn)s) followed by a sequence of declarations
((decl)s). A (decl), consisting of a (typename) and a list
of (identifier)s, specifies that these (identifier)s are to
assume values only of the type given by the (typename).
Types in ML-4 are denoted by members of two syntactic
classes as follows:

(1) A (typename) is either the symbol int (which denotes
    the type consisting of integer values) or the
    name associated with some type by a (defn).

(2) A (structype) denotes a structured type (i.e. a
    type consisting of structured values). The
    (selector)s and types of the associated components
    of a value of such a type are specified by the
    (comp decl)s (component declarations) in a
    (structype).

Observe that if we know the type of a structured value, then
we know the type of each of its components. There are two
basic purposes for using (typename)s: first, to provide for
multilevel structures (i.e. structures with components which
are structures), and second, to allow for recursion in type
definitions. We discuss recursive types later.

Semantics of ML-4 (informal)

(1) Data types and type definitions: We define the
data types that are specified by the syntactic units of
ML-4. Elements of the classes (typename) and (structype)
define data types according to three rules:

(i)   The (typename) int denotes the class of all
      integer values.

(ii)  Suppose s1,...,sk are (selector)s and
      t1,...,tk are syntactic items denoting data
      types. Then the (structype) [t1 s1; ...; tk sk]
      denotes the class of all structures with
      exactly k components with selectors
      s1,...,sk such that for each i = 1,...,k the
      value (if any) contained in the component cell
      selected by si belongs to the type ti.

(iii) If t is the (typename) of a (defn), then t
      denotes the type specified by the (structype)
      of that (defn). In this case we say that the
      (defn) defines the (typename) t.

These rules give the semantics for type definitions in ML-4.
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© ® © 



Fig. 4.2-1. Objects of type int.

Note that according to rule (ii), if x is a value belonging
to a structured type t, then the types of all the component
cells of x are determined.

As examples, the objects of figure 4.2-1 belong to the
type int. In the presence of the (defn)s
pt = [ int p ] and t = [ int a; pt b ],
the objects depicted in figure 4.2-2
belong to the type t (which is the class
of all two-component structures with
a-component of type int and with b-component a one-component
structure whose p-component is of type int). Note particularly
that a cell constrained by our type mechanism to hold
values of a given type can be empty. A value may belong to
more than one type (particularly if it is a structure some
of whose component cells are empty). But given any value
v and any type t, one can always tell whether or
not v belongs to t.

[Diagram not reproduced.]

Fig. 4.2-2. Six objects of type t = [int a; pt b]
(where pt = [int p])
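
The claim that membership of a value in a type is always decidable can be sketched as a recursive test that follows rules (i) to (iii). The Python below is a hypothetical illustration, not the translator's mechanism: Cell and belongs are our own names, defns maps each (typename) to its list of (component typename, selector) pairs, an empty cell has contents None, and the sketch assumes that values are acyclic (a cyclic object of a recursive type would need a visited set). The type u is an extra example of ours, added to show one value belonging to two types.

```python
class Cell:
    """A location; contents is an int, None (empty), or a dict selector -> Cell."""

    def __init__(self, contents=None):
        self.contents = contents


def belongs(value, typename, defns):
    """Decide whether value belongs to the type denoted by typename."""
    if typename == "int":                                  # rule (i)
        return isinstance(value, int)
    if not isinstance(value, dict):
        return False
    comp_decls = defns[typename]                           # rule (iii)
    if set(value) != {sel for _, sel in comp_decls}:       # exactly k selectors
        return False
    return all(                                            # rule (ii)
        value[sel].contents is None
        or belongs(value[sel].contents, comp_type, defns)
        for comp_type, sel in comp_decls
    )


defns = {
    "pt": [("int", "p")],
    "t": [("int", "a"), ("pt", "b")],
    "u": [("int", "a"), ("int", "b")],   # extra type, ours, for comparison
}
full = {"a": Cell(3), "b": Cell({"p": Cell(4)})}
empty = {"a": Cell(), "b": Cell()}
bad = {"a": Cell(3), "b": Cell(5)}

print(belongs(full, "t", defns), belongs(bad, "t", defns))
print(belongs(empty, "t", defns), belongs(empty, "u", defns))
```

The structure with two empty component cells belongs to both t and u, which illustrates the remark that a value may belong to more than one type.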






A (typename) does not have to be defined textually before
it is used in a (prelude). For instance, the (defn)
sequence t1 = [t2 c]; t2 = [int d; int e] is perfectly
legal. A nontrivial application is the definition of recursive
data types, which arise in ML-4 when a (typename) is
used as part of the (structype) in its definition. Consider,
for example, the (defn) r = [ int a; r b ]. This
defines a type named r consisting of two-component structures
for which the a-component cell can hold only integer
values and the b-component cell can hold values only of
type r. Although it sounds circular, it is perfectly well
defined. Values of a recursively defined type can have
substructures nested to an arbitrary depth, and BL objects
representing such values frequently contain directed cycles.

We make three restrictions on (defn)s in ML-4. First, 
the (selector)s occurring in a (structype) must be distinct.
Second, a (typename) can be defined only once in a (program). 
Third, the (typename) int must not be redefined. Any 
(program) not obeying these restrictions is syntactically 
invalid (i.e. is to be rejected by the translator). The 
meaning of an invalid (program) is undefined. 
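
The three restrictions lend themselves to a mechanical check. The following Python sketch is one possible way a translator might test them; the function name check_defns and the representation (a list of pairs of a typename and its (component typename, selector) list, in textual order) are assumptions of ours.

```python
def check_defns(defns):
    """Return a list of violations; an empty list means the (defn)s pass."""
    errors = []
    seen = set()
    for name, comp_decls in defns:
        if name == "int":                          # third restriction
            errors.append("int must not be redefined")
        if name in seen:                           # second restriction
            errors.append(f"{name} is defined more than once")
        seen.add(name)
        selectors = [sel for _, sel in comp_decls]
        if len(selectors) != len(set(selectors)):  # first restriction
            errors.append(f"selectors of {name} are not distinct")
    return errors


# Use before definition and recursion are both permitted.
legal = [
    ("t1", [("t2", "c")]),
    ("t2", [("int", "d"), ("int", "e")]),
    ("r", [("int", "a"), ("r", "b")]),
]
illegal = [
    ("z", [("int", "a"), ("int", "a")]),
    ("z", [("int", "b")]),
    ("int", []),
]

print(check_defns(legal))
print(check_defns(illegal))
```

The check deliberately accepts a (typename) that is used before its (defn) and one that refers to itself, since neither is excluded by the restrictions above.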

(2) Declarations: As with (defn)s, the semantics for a
(decl) does not specify any particular actions to be
performed at runtime. The effect of a (decl) is to cause the
(identifier)s in it to be associated with the type named in
the (decl).

In order for a (program) to be syntactically valid, every
(identifier) occurring in some (assignment) must appear
exactly once in the (program) in a (decl). Every (typename)
occurring in some (decl) must be defined exactly once in the
(defn)s. From the above semantic rules for (defn)s and
(decl)s, it is possible to uniquely determine the type of any
(expression) in a syntactically valid (program). This is done
as follows:

(i)  The type of a (destination): If it is an
     (identifier), then this (identifier) occurs in
     exactly one (decl), which gives its type. If it is
     a (selection), the type of the structure selected
     from can be determined recursively; the type of
     the (selection), then, is the type given in the
     (comp decl) of that (structype) that contains the
     given (selector).

(ii) The type of any other (expression): there are two
     cases. Integers are of type int, and
     (construction)s are of the type given by their
     (typename).

Thus we can determine from the text of a syntactically
valid (program) the type of any (expression); this type is
given by precisely one (typename). For example, in the
presence of the (prelude)
xtype = [ int a; ytype b ]; ytype = [ int c; int d ];
xtype x; ytype y
the type correspondences shown in figure 4.2-3 are valid.



    (expression)        Type

    x                   xtype
    a of x              int
    b of x              ytype
    c of y              int
    d of b of x         int
    3                   int
    ytype[3;4]          ytype
    xtype[6; nil]       xtype

Fig. 4.2-3. Types of sample (expression)s.

(3) Assignments: The semantics of an ML-4 (assignment)
specifies the same runtime actions as its ML-3 counterpart;
in addition, the translator is directed to perform certain
additional tests. An (assignment), as before, consists
of a (destination) and an (expression). The ML-4 type
system forces the cell referred to by the (destination) to hold
values only of a certain type. Thus the translator must
verify that the value of this (expression) matches this type.

A (construction) in which the components fail to match 
the types of the corresponding fields in the (defn) of its 
(typename) is an invalid (expression) and has undefined type. 
For example, if we define z = [ int a; int b ], then the
(construction) z[1;2;3] is invalid because of its extra
component; the (construction) z[1; z[2;3]] is also invalid
because its b-component is of type z rather than int as re- 
quired. We also call a (construction) invalid if its 
(typename) is not defined in the (prelude). 
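
The rules for the type of an (expression) and for invalid (construction)s can be sketched as a small recursive function. This Python is illustrative only and is not the translator described later in the chapter: type_of and the tuple encoding of expressions are assumptions of ours (an int is an integer constant, None is nil, a str is an (identifier), ("of", selector, destination) is a (selection), and ("new", typename, fields) is a (construction)). We also assume that a nil field is acceptable for any component type, since it leaves the component cell empty. An undefined type is reported as None.

```python
def type_of(expr, defns, decls):
    """Return the typename of expr, or None if its type is undefined."""
    if expr is None:
        return None
    if isinstance(expr, int):
        return "int"
    if isinstance(expr, str):                        # (identifier)
        return decls.get(expr)
    kind = expr[0]
    if kind == "of":                                 # (selection)
        _, selector, dest = expr
        comp_decls = defns.get(type_of(dest, defns, decls), [])
        for comp_type, sel in comp_decls:
            if sel == selector:
                return comp_type
        return None
    if kind == "new":                                # (construction)
        _, typename, fields = expr
        comp_decls = defns.get(typename)
        if comp_decls is None or len(fields) != len(comp_decls):
            return None                              # undefined name or wrong arity
        for field, (comp_type, _) in zip(fields, comp_decls):
            if field is not None and type_of(field, defns, decls) != comp_type:
                return None                          # component type mismatch
        return typename
    return None


defns = {
    "z": [("int", "a"), ("int", "b")],
    "xtype": [("int", "a"), ("ytype", "b")],
    "ytype": [("int", "c"), ("int", "d")],
}
decls = {"x": "xtype", "y": "ytype"}

print(type_of(("new", "z", [1, 2, 3]), defns, decls))                 # extra component
print(type_of(("new", "z", [1, ("new", "z", [2, 3])]), defns, decls)) # b is not int
print(type_of(("of", "d", ("of", "b", "x")), defns, decls))           # d of b of x
print(type_of(("new", "xtype", [6, None]), defns, decls))             # xtype[6; nil]
```

With these definitions the two invalid constructions from the text yield None, while d of b of x and xtype[6; nil] receive the types listed in figure 4.2-3.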



An ML-4 (program) is invalid if in any of its
(assignment)s the type of the (expression) is undefined or
fails to match the type of the (destination). Each of these
two types is given by precisely one (typename); the two
types are defined to match if and only if their (typename)s
are identical. The mechanisms we shall describe for the
translator insure that it can always determine whether or
not a given ML-4 (program) is valid. There is no need for
runtime type tests, nor are there any runtime type errors.
However, a runtime error will occur if there is an attempt
to extract components from an empty cell of a structured
type. For instance, the ML-4 (program)

    s1 = [int a; s2 b]; s2 = [int c]; s1 x;
    x ← s1[3; nil]; c of b of x ← 4

will fail on interpretation of its last (assignment) (since
the interpreter will look for a nonexistent c-component in
the empty cell for b of x) even though the type of the
(destination) c of b of x (int) matches the type of the
(expression) 4. Thus we require runtime tests to check the
(selection)s in ML-4. Generally speaking, testing for empty
cells is much easier than testing the type of the contents
of a cell at runtime.
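The distinction between a valid (program) and one that runs without error can be sketched in Python. This is only an illustration under assumptions of our own: the dictionaries DEFNS and DECLS, the helper names static_type and store, and the use of None for an empty cell are hypothetical and are not part of ML-4 or BL.

```python
# Hypothetical model of the example (program); all names are illustrative.
# s1 = [int a; s2 b]; s2 = [int c]; s1 x
DEFNS = {"int": {}, "s1": {"a": "int", "b": "s2"}, "s2": {"c": "int"}}
DECLS = {"x": "s1"}


class EmptyCellError(Exception):
    """Raised when a component is selected from an empty structured cell."""


def static_type(dest):
    """Typename of a destination written innermost-last,
    e.g. ["c", "b", "x"] stands for c of b of x."""
    *selectors, ident = dest
    typename = DECLS[ident]
    for sel in reversed(selectors):
        typename = DEFNS[typename][sel]
    return typename


def store(env, dest, value):
    """Interpret dest <- value, testing each cell on the way for emptiness."""
    *selectors, ident = dest
    cell, key = env, ident
    for sel in reversed(selectors):
        cell = cell[key]
        if cell is None:  # runtime emptiness test on the selected cell
            raise EmptyCellError(f"cannot select {sel}: cell is empty")
        key = sel
    cell[key] = value


env = {"x": {"a": 3, "b": None}}  # state after x <- s1[3; nil]
valid = static_type(["c", "b", "x"]) == "int"  # the types of both sides match
```

Here the static test succeeds, yet calling store(env, ["c", "b", "x"], 4) raises EmptyCellError, which mirrors the failure described above.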



If we strip off the (prelude) from a valid ML-4
(program), then we will have in essence an ML-3 (program) in
which each cell takes on values of only one type. Moreover,
the effect of executing this ML-4 (program) is identical to
the effect of executing its ML-3 equivalent.

Translation into BL 

To give a precise formulation for the semantics of
ML-4, we describe the translation of ML-4 (program)s into BL.
With the previous mini-languages, it sufficed to show the BL
code corresponding to various program constructs, namely the
different kinds of assignment statements. This is no longer
sufficient in the case of ML-4, since the semantics now
contains rules for typechecking by the translator. We must
therefore also describe the typechecking procedures
performed by the ML-4 translator.

In discussing how the translator performs typechecking
of ML-4 (program)s to determine their validity, we begin by
describing the information supplied to the translator by the
(prelude) of a (program). We shall treat the translator as
a BL procedure. As it processes the (prelude), the
translator builds two component objects in its local
structure: one component named $defns which represents the
type definitions, and one named $decls which corresponds to
the declarations. $defns is a structure which has one
component for each (typename) found in the (prelude). Each
component of $defns is a structure with information on the
type associated with the (typename). For each (typename)
defined in a (defn), the corresponding component of $defns
has an "n" field with the number of components in a value
of that type, numbered fields giving the (selector)s of the
components in the proper order, and a "val" field giving the
types of the components (by means of links to the proper
entries in $defns). The int-component of $defns has only a
val-component containing the elementary value 'int'. $decls
is a structure with one component for each (identifier)
declared in the (prelude). If, say, the (identifier) x is
declared to have type t, then the x-component of $decls
shares with the t-component of $defns. Figures 4.2-4
through 4.2-6 show the $defns and $decls structures for
sample (prelude)s. The type with (typename) t in figure
4.2-5 is recursive; the reader should observe that $defns
has a directed cycle in this case.

Once these objects have been constructed by the
translator, all the information required for typechecking is
available. Each type to be associated with some cell
referred to in the (program) is represented by a component
node of $defns. Two types match iff they have the same
(typename). To describe how the translator performs the
actual typechecking, all that needs to be shown is how to
access the node for the type of any ML-4 (expression); once
we can do this, the typechecking is straightforward: an
(assignment) has a type error iff the nodes for the types
of its (destination) and its (expression) are distinct.

Fig. 4.2-4. Structures for the (prelude)
int x,y,z

Fig. 4.2-5. $defns and $decls structures for the (prelude)
t = [t p; int v]; t x,y; int m

Fig. 4.2-6. $defns and $decls for the (prelude)
t1 = [int a; t2 b]; t2 = [int c]; t1 x1; t2 x2
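The two structures and the matching rule can be imitated in Python, where object identity plays the part of "leading to the same node". This is a sketch under our own assumptions: build_tables and same are hypothetical names, Python dictionaries stand in for BL structures, and the sample (prelude) is the one reconstructed for figure 4.2-5.

```python
def build_tables(defns_src, decls_src):
    """defns_src: {typename: [(field_typename, selector), ...]}
    decls_src: {identifier: typename}
    Returns dictionaries playing the roles of $defns and $decls."""
    defns = {"int": {"val": "int"}}
    for name in defns_src:
        defns[name] = {}  # create every node first so that links may be cyclic
    for name, fields in defns_src.items():
        node = defns[name]
        node["n"] = len(fields)  # number of components
        node["val"] = {}
        for i, (ftype, sel) in enumerate(fields, start=1):
            node[i] = sel  # numbered field: i-th selector
            node["val"][sel] = defns[ftype]  # link to the field's type node
    # each declared identifier shares the node of its type
    decls = {ident: defns[t] for ident, t in decls_src.items()}
    return defns, decls


def same(node_a, node_b):
    """Two types match iff their nodes are one and the same object."""
    return node_a is node_b


# prelude: t = [t p; int v]; t x,y; int m
defns, decls = build_tables(
    {"t": [("t", "p"), ("int", "v")]},
    {"x": "t", "y": "t", "m": "int"},
)
```

With these tables, same(decls["x"], decls["y"]) holds, same(decls["x"], decls["m"]) does not, and defns["t"]["val"]["p"] is defns["t"] itself, which is the directed cycle of a recursive type.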

The type of an (identifier) x is given by $decls.x.
The translator will mark a (program) invalid if any of its
(identifier)s are undeclared. If N is the node for the type
of a (destination) D, then the type of the (selection)
s of D is given by the node N.val.s. The translator
verifies as part of its typechecking that values of the type
of D do indeed have s-components. Thus we can ascertain the
node for the type of any (destination) in an ML-4 (program).
Figure 4.2-7 illustrates some sample ML-4 (assignment)s
involving only (destination)s and gives BL typechecking code
to determine their validity. A branch to the label "no"
indicates that the (assignment) has a type error.

ML-4 code               BL typechecking code

y ← x                   same? $decls.y,$decls.x,no

z ← a of x              has? $decls.x.val,a,no
                        same? $decls.z,$decls.x.val.a,no

b of y ← z              has? $decls.y.val,b,no
                        same? $decls.y.val.b,$decls.z,no

b of y ← c of a of x    has? $decls.y.val,b,no
                        has? $decls.x.val,a,no
                        has? $decls.x.val.a.val,c,no
                        same? $decls.y.val.b,$decls.x.val.a.val.c,no

Fig. 4.2-7. Examples of BL typechecking.
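The pattern in figure 4.2-7 is regular enough to be generated mechanically. The following Python sketch produces the BL test sequence for an (assignment) between two (destination)s; the function names node_path and typecheck_code and the list encoding of a (destination) are assumptions made for illustration, and the instruction text follows the figure as we read it.

```python
def node_path(dest):
    """dest is written innermost-last: ["c", "a", "x"] means c of a of x.
    Returns the pathname of the type node and the has? tests guarding it."""
    *selectors, ident = dest
    path = "$decls." + ident
    tests = []
    for sel in reversed(selectors):
        # the type reached so far must have a component named sel
        tests.append(f"has? {path}.val,{sel},no")
        path = f"{path}.val.{sel}"
    return path, tests


def typecheck_code(lhs, rhs):
    """BL instructions that branch to "no" if lhs <- rhs has a type error."""
    lpath, ltests = node_path(lhs)
    rpath, rtests = node_path(rhs)
    return ltests + rtests + [f"same? {lpath},{rpath},no"]


code = typecheck_code(["b", "y"], ["c", "a", "x"])
```

For b of y ← c of a of x this yields three has? tests followed by the same? test on $decls.y.val.b and $decls.x.val.a.val.c, in the order shown in the last row of the figure.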

If an (expression) is an (integer), then its type is
given by the node $defns.int. The type of a (construction)
whose (typename) is t is given by the node $defns.t,
provided the (construction) is valid. To check this, the
types of the components in the (construction) must match the
(typename)s in the (structype) that defines t; moreover,
there must be the same number of components in both places.
Thus the translator can access by our scheme the node for
the type of any (generator). As a result, we now see how
the translator accesses the nodes for the types of arbitrary
ML-4 (expression)s. Figure 4.2-8 gives some examples of
ML-4 (assignment)s containing arbitrary kinds of
(expression)s; along with each (assignment) we show BL code
which tests its validity. This completes our picture of
how the translator performs static typechecking; the
mechanisms should be clear from the examples in figures
4.2-7 and 4.2-8.
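The validity rule for a (construction) amounts to a short recursive procedure. The Python sketch below is a hypothetical rendering: the table DEFNS, the tuple encoding of a (construction), and the function name type_of are ours. It also assumes, on the strength of examples such as s1[3; nil], that nil is acceptable in a field of any type.

```python
# z = [int a; int b], written as a list of (selector, typename) pairs
DEFNS = {"z": [("a", "int"), ("b", "int")]}


def type_of(expr):
    """expr is an integer or a pair (typename, [component expressions]).
    A component may be None, standing for nil.
    Returns the typename, or None when the type is undefined."""
    if isinstance(expr, int):
        return "int"
    typename, components = expr
    fields = DEFNS.get(typename)
    if fields is None:  # typename not defined in the prelude
        return None
    if len(fields) != len(components):  # wrong number of components
        return None
    for (_, field_type), comp in zip(fields, components):
        if comp is not None and type_of(comp) != field_type:
            return None  # component type fails to match the field
    return typename
```

Under this sketch z[1;2] has type z, while z[1;2;3] and z[1;z[2;3]] both have undefined type, in agreement with the examples given earlier for invalid (construction)s.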

The actual BL code generated by the translator (i.e.
the BL code to be interpreted at runtime during the
execution of an ML-4 (program)) is similar to what we
presented in the section on ML-3. There are two differences
reflecting the switch of typechecking from runtime to
translate-time. First, occurrences of (selection)s in ML-3
yield runtime type tests, such as the BL code
has? x,b,error for the ML-3 (selection) b of x. In ML-4
this runtime type test is replaced by the simpler and faster
test nonempty? x,error, which makes sure there is no
erroneous attempt to access component cells of an empty cell.

Fig. 4.2-8. More examples of BL typechecking (BL validity
tests for ML-4 (assignment)s whose (expression)s are
(integer)s and (construction)s; e.g. x ← 2 is tested by
same? $decls.x,$defns.int,no).

The second change is that the complicated procedure
assign3 with all its type tests is not needed at all. The
BL code generated from the (assignment) y ← x depends on
the type of the (destination) y. If its type is int, then
by virtue of the translator's static typechecking we know
that x can hold only integer values. In this case the BL
code in figure 4.2-9 is generated. If y is of a structured
type, then the translator knows that its (selector)s
s1, ..., sn are given by

    s1 = *($decls.y.1), ..., sn = *($decls.y.*($decls.y.n)).

In this case the BL code in figure 4.2-10 is generated. The
translator can always tell which case applies by testing
whether the pathnames $decls.y and $defns.int lead to the
same cell. The BL instruction same? $decls.y,$defns.int,go
performs this test. A branch to the label "go" indicates
that y has a structured value and that the second case
applies. Thus, by substituting the nonempty? test for the
has? test and the code of figures 4.2-9 and 4.2-10 for
invocations of the assign3 procedure, we arrive at the ML-4
translator. This completes our description of the
translation of ML-4 into BL and places the semantics of ML-4
on firm and precise ground.

          clear y
          nonempty? x,skip
          const *x,y
    skip:

Fig. 4.2-9. BL code for the ML-4 (assignment)
y ← x when y is int.

Fig. 4.2-10. BL code for the ML-4 (assignment)
y ← x when y is structured.



4.3.

Most programming languages handling data structures
have a type system similar to that of ML-4, in which most or
all typechecking is done at translation time rather than
runtime. In this section we treat the data structuring
facilities of three such languages: ALGOL W, PL/1 and
ALGOL 68.

ALGOL W

The language ALGOL W [Wir ..] has a relatively simple
treatment of data structures. The structures are called
records, and the ALGOL W analog to an ML-4 structured type
is called a record class. An ALGOL W record class
declaration can be represented by an ML-4 (defn). Figure
4.3-1 shows how the two languages define classes of
structured objects; the ML-4 type with (typename) pair
corresponds to the ALGOL W record class named pair.
Structured objects are built in ALGOL W through the use of
record designators, which are analogous to ML-4
(construction)s. Expressions in both languages which build
structures from the "pair" class are also shown in figure
4.3-1.



language    type definition              object construction

ALGOL W     record pair (integer a,b)    pair(3,4)

ML-4        pair = [int a; int b]        pair[3;4]

Fig. 4.3-1. A parallel between ALGOL W and ML-4.



There is a major difference between ALGOL W and ML-4
with respect to these elements. Although a record
designator builds a structured object in ALGOL W, it does
not yield as its value the object it constructs. In fact,
records are not even values in ALGOL W. A record class is
not a legitimate type in ALGOL W; records are accessed
through values of reference types. For instance, the
ALGOL W record designator pair(3,4) in figure 4.3-1 yields a
value of type reference(pair). ML-4 will treat reference
expressions in ALGOL W similarly to the way ML-3 treats
pointers in ML-2. The correspondence is depicted in figure
4.3-2.

ALGOL W
    record pair (integer a,b);
    reference (pair) y,z;
    y := pair(3,4); z := y

ML-4
    pair = [int a; int b];
    refpair = [pair ptr];
    refpair y,z;
    y ← refpair[pair[3;4]];
    z ← y

Fig. 4.3-2. Reference expressions in ALGOL W.

Note that in dealing with ALGOL W records, we need an extra
level of indirection (the "ptr" component). This (at least
with respect to our scheme of representation) is the same
kind of inefficiency we encountered with ML-2. It is worse
here, though, since ML-2 made use of the indirection only
when sharing was needed.
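The effect of the extra "ptr" level can be imitated in Python, where a dictionary object stands in for a record and object identity stands in for a shared cell. The helper names new_pair, new_refpair and assign_ref are hypothetical, and the sketch assumes that the (assignment) z ← y leaves both ptr components referring to one and the same record, as figure 4.3-2 depicts.

```python
def new_pair(a, b):
    """Model of the (construction) pair[a;b]: a record with two fields."""
    return {"a": a, "b": b}


def new_refpair(record):
    """Model of refpair[...]: a one-component structure holding the record."""
    return {"ptr": record}


def assign_ref(source):
    """Model of z <- y for refpair values: the new value refers to the
    same record as the source, so the record is shared, not copied."""
    return {"ptr": source["ptr"]}


y = new_refpair(new_pair(3, 4))  # y := pair(3,4)
z = assign_ref(y)                # z := y
z["ptr"]["a"] = 9                # an update made through z ...
seen_through_y = y["ptr"]["a"]   # ... is visible through y
```

Every access to a field passes through "ptr", which is the indirection discussed above; in this model it is paid whether or not the record is ever shared.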

Components of a record can be accessed by selector
functions in ALGOL W. Figure 4.3-3 shows the correspondence
between selections in ALGOL W and ML-4 (z is of type
reference(pair) in ALGOL W, refpair in ML-4).

language    selection

ALGOL W     a(z)

ML-4        a of ptr of z

Fig. 4.3-3. Selection.

Once these differences concerning the construction and
selection operations have been taken into account, we find
that assignment, sharing and typechecking in ALGOL W are
almost identical to the "obvious" ML-4 counterparts (e.g.
replace ":=" with "←"). In this respect, ALGOL W is similar
to the language SNOBOL4 described in section 3.6.

PL/1 

PL/1 was one of the earliest languages to have
compile-time typechecking and to treat both data structures
and pointers. Most PL/1 constructs handling these notions
look markedly different from the constructs we have seen in
other languages. Figure 4.3-4 shows how PL/1 handles a
sample structure and gives an ML-4 equivalent. We make two
observations. First, all component cells of the PL/1
structures in this example are allocated when the
declarations are interpreted. With ML-4, component cells
are allocated when the structured value is actually
constructed. Second, a PL/1 structure assignment like Y = X
in fig. 4.3-4 signifies componentwise copying (including
structured components) as with ML-1 and QUEST.

PL/1
    DECLARE 1 X,
              2 I FIXED BIN,
              2 S,
                3 J FIXED BIN,
                3 K FIXED BIN;
    DECLARE Y LIKE X;
    DECLARE Z LIKE X.S;
    X.I = 5; X.S.J = 6;
    Y = X;
    Y.S.K = X.I;
    Z = Y.S;

ML-4
    trip = [int i; pair s];
    pair = [int j; int k];
    trip x,y; pair z;
    x ← trip[nil; pair[nil;nil]];
    y ← trip[nil; pair[nil;nil]];
    z ← pair[nil;nil];
    i of x ← 5; j of s of x ← 6;
    i of y ← i of x;
    j of s of y ← j of s of x;
    k of s of y ← k of s of x;
    k of s of y ← i of x;
    j of z ← j of s of y;
    k of z ← k of s of y

Fig. 4.3-4. Structures in PL/1.
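The copying behaviour of the structure assignments in fig. 4.3-4 can be traced with a small Python model. Nested dictionaries stand in for the PL/1 structures, None marks a component that has not yet been assigned, and copy.deepcopy plays the part of componentwise copying; these are modelling choices of ours rather than anything PL/1 prescribes.

```python
import copy

# DECLARE 1 X, 2 I, 2 S, 3 J, 3 K: all component cells exist from the start
x = {"i": None, "s": {"j": None, "k": None}}

x["i"] = 5                 # X.I = 5
x["s"]["j"] = 6            # X.S.J = 6
y = copy.deepcopy(x)       # Y = X    (copies every component, S included)
y["s"]["k"] = x["i"]       # Y.S.K = X.I
z = copy.deepcopy(y["s"])  # Z = Y.S  (again a copy, not a shared substructure)
```

After these steps the k-component of x is still unassigned while that of y holds 5, and z is a separate object equal to {"j": 6, "k": 5}; no update to one structure can be observed through another.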

As with ALGOL W, there is no sharing among PL/1
structures until we introduce pointers and the attribute
BASED. If P has been declared to be a pointer, then
declaring a structured variable with the attribute BASED(P)
introduces a vast conceptual difference. The variable no
longer signifies a location where structured objects may be
stored. Instead, it plays the role of a structured type.
Figure 4.3-5 shows a set of PL/1 declarations involving
BASED structures and gives a corresponding ML-4 (prelude)
and set of ALGOL W declarations.

Although the PL/1 declarations of figure 4.3-4 specify
allocation of storage to hold structured values (and
allocation of component cells as well), the declaration of
LIST in figure 4.3-5 does no such thing. BASED structure
values in PL/1 are constructed through the use of an
ALLOCATE statement. Under the declarations in figure 4.3-5,
the PL/1 statement ALLOCATE LIST may be represented in ML-4
by the (assignment) p ← ptrlist[list[nil;nil;nil]]. Since
LIST is declared to be BASED on the pointer P, the
allocation causes the value of P to be set to point to the
newly-built structure. The result of this allocation is
shown in fig. 4.3-6.

PL/1
    DECLARE (P,H,T) POINTER;
    DECLARE 1 LIST BASED (P),
              2 BACK POINTER,
              2 FWD POINTER,
              2 NUM FIXED BIN;

ML-4
    ptrlist = [list ptr];
    list = [ptrlist back;
            ptrlist fwd;
            int num];
    ptrlist p,h,t

ALGOL W
    record list =
        (reference (list) back;
         reference (list) fwd;
         integer num);
    reference (list) p,h,t;

Fig. 4.3-5. PL/1 BASED structures as types.

Fig. 4.3-6. Value of p.

BASED structures in PL/1 are accessed through
pointers. In our LIST example, a use of the name LIST
refers to whatever the pointer P is currently pointing to
(which will be the most recently constructed structure BASED
on P, unless P has been subsequently updated). To refer to
a previous allocation, one must use a qualified reference
such as T -> LIST (which indicates whatever the pointer T is
currently pointing to). Figure 4.3-7 draws the connection
between PL/1, ALGOL W and ML-4 in accessing fields of
structures (it is assumed that the declarations in fig.
4.3-5 are still in force).

PL/1             ALGOL W    ML-4

LIST             p          ptr of p

T -> LIST        t          ptr of t

LIST.NUM         p.num      num of ptr of p

T -> LIST.NUM    t.num      num of ptr of t

Fig. 4.3-7. Accessing fields.



The meaning of assignment in PL/1 is similar to ALGOL W
except for its handling of structured values (which ALGOL W
does not choose to handle). In this case, as we have said,
PL/1 copies rather than induces sharing. All sharing of
data in PL/1 is done through pointers.

Typechecking in PL/1 differs from ML-4 and ALGOL W in
one major manner: that of pointers. The ALGOL W translator
insures that a reference value can point to records only
from one record class; if c1 and c2 are distinct record
classes, then any attempt to make a value of type
reference(c1) point to a record from class c2 will be caught
by the translator and marked as illegal. The type system
for ML-4 imposes essentially the same restrictions.
However, a variable of type POINTER in PL/1 can be set to
point to values of any type at any time (including
nonstructured values). This causes difficulties of the same
kind that static typechecking is supposed to eliminate. For
example, in the PL/1 program segment of figure 4.3-8, the
assignment P = Q is legal, even though P points to a
structure of type M1 and Q points to the integer M2. The
reference to M1 in the following line (M1.K = 5) designates
whatever P will be pointing to (which is the integer M2
since P has just been assigned the value of Q). Thus there
will be (depending on the implementation) a runtime error or
at least an erroneous result as an outcome of the attempt to
update a component of the integer value M2. The ML-4
translation of this program, also shown in figure 4.3-8, is
invalid since in the (assignment) p ← q the types fail to
match (ptrm1 vs. ptrm2). If in the PL/1 program we had
declared M2 to be BASED on P, then the corresponding ML-4
(program) would have two conflicting declarations for p,
which would also render it invalid. Thus we see that the
typechecking system in PL/1 fails to catch a whole class of
programs which might have runtime type errors.

PL/1
    DECLARE (P,Q) POINTER;
    DECLARE 1 M1 BASED (P),
              2 J FIXED BIN,
              2 K FIXED BIN;
    DECLARE M2 FIXED BIN BASED (Q);
    ALLOCATE M1;
    ALLOCATE M2;
    P = Q;
    M1.K = 5;

ML-4
    m1 = [int j; int k];
    ptrm1 = [m1 ptr];
    ptrm2 = [int ptr];
    ptrm1 p; ptrm2 q;
    p ← ptrm1[m1[nil;nil]];
    q ← ptrm2[nil];
    p ← q;
    k of ptr of p ← 5

Fig. 4.3-8. Lack of type restrictions on PL/1 pointers.

ALGOL 68 

The treatment of data structures and pointers in
ALGOL 68 is linked to an intricate system of types and
typechecking. ALGOL 68 is a difficult language to learn and
understand; the defining documentation [VWij 69; VWij 73]
presents an intimidating formalism to the uninitiated.
However, there are works (e.g. [Lind 71]) which are
immensely helpful.

Types in ALGOL 68 are called modes. The modes of
relevance to us are the mode int (integer values) and the
modes built from the mode-constructors struct and ref
(structured and reference values, respectively). We
describe a correspondence which assigns ML-4 types to
ALGOL 68 modes:

(1) To the ALGOL 68 mode int we assign the ML-4
type int.

(2) If M_1, ..., M_k are modes and S_1, ..., S_k are tags
(the equivalent of (selector)s), then to the mode
struct(M_1 S_1, ..., M_k S_k) we assign the ML-4 type
[T_1 S_1; ...; T_k S_k], where the T_i are the ML-4 types
corresponding to the M_i.

(3) If M is a mode then to the mode ref M we assign
the type [T ptr], where T is the ML-4 type
corresponding to M.
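Rules (1) through (3) can be read as a recursive procedure that emits ML-4 (defn)s. The Python sketch below is an illustration only: the encoding of modes, the function names typename and translate, and the convention of naming the type for ref M by prefixing "ref" to the name for M are assumptions, the last one taken from the names used in the figures of this section (refint, refpair, refbox, refrefbox).

```python
def typename(mode):
    """A mode is "int", the name of a declared struct mode, or ("ref", mode).
    Returns the ML-4 typename used for it."""
    if isinstance(mode, str):  # rule (1), or a declared struct mode
        return mode
    _, target = mode
    return "ref" + typename(target)


def translate(mode_decls):
    """mode_decls: {name: [(mode, tag), ...]} for each struct mode-declaration.
    Returns {ML-4 typename: text of its definition}."""
    defns = {}

    def visit(mode):
        if isinstance(mode, tuple):  # rule (3): ref M becomes [T ptr]
            visit(mode[1])
            defns.setdefault(typename(mode), f"[{typename(mode[1])} ptr]")

    for name, fields in mode_decls.items():
        for field_mode, _ in fields:
            visit(field_mode)
        # rule (2): struct(M_1 S_1, ...) becomes [T_1 S_1; ...]
        body = "; ".join(f"{typename(m)} {tag}" for m, tag in fields)
        defns[name] = "[" + body + "]"
    return defns


# mode box = struct(int v, ref box n)
result = translate({"box": [("int", "v"), (("ref", "box"), "n")]})
```

For the recursive mode box the sketch yields the two definitions box = [int v; refbox n] and refbox = [box ptr].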

Mode-declarations in ALGOL 68 are just like type definitions
in ML-4; for example the mode-declaration
mode pair = struct(int a, int b) is equivalent to the ML-4
(defn) pair = [int a; int b].

A declaration in ALGOL 68, besides associating an
identifier with a mode and imposing type restrictions on the
rest of the program, has a two-fold runtime effect.
Consider a declaration of form M X = E, for instance
int x = 3, where M is a mode, X an identifier, and E an
expression yielding a value of mode M. This declaration
first binds X to a newly-allocated cell. Second, it places
the mode M value yielded by E into this cell. What is
peculiar about ALGOL 68 declarations is that this value can
never be changed. It may, however, be a reference value
(i.e. the mode M is ref N for some other mode N); in this
case it refers to (points to) a cell holding values of mode
N. This latter cell (and not the former cell) can be
updated by the assignment operation in ALGOL 68. Thus the
meaning of assignment in ALGOL 68 differs from assignment in
the other languages we have discussed. Note that an
identifier whose declared mode is not a reference mode
serves essentially as a constant. An identifier of mode
ref N in ALGOL 68 plays the same role as a variable of type
N in another programming language.

The specific definition of ALGOL 68 assignment is as 
follows: let E be an expression yielding a value of mode M 
(M can be arbitrary) and D an expression of mode ref M. 
The value of D is a reference to a cell which can hold val- 
ues of mode M. Then D := E is a valid assignment and 
specifies that the mode-M value of E is to be stored in the 
mode-M cell referred to by (the value of) D. 

A particular kind of ALGOL 68 expression, known as a
local generator, specifies allocation of a new cell when it
is evaluated. If M is a mode, then evaluation of the local
generator loc M causes a new cell (which can only hold
values of mode M) to be allocated. The value yielded by
loc M is a reference to this new cell and therefore belongs
to the mode ref M.



To obtain a variable in ALGOL 68 which will take on
values of a mode M, we must declare an identifier X of mode
ref M so that assignment can change the mode-M values.
This may be accomplished by means of an ALGOL 68 declaration
of form M X, which is really an abbreviation for the
declaration ref M X = loc M. Consider, for example, the
ALGOL 68 declaration int x (equivalent to the declaration
ref int x = loc int), whose effect is depicted in figure
4.3-9. The identifier x, which is declared here to be of
mode ref int, is bound to the upper cell; the lower cell is
allocated (by evaluating loc int in ALGOL 68, and by
evaluating the (cell expr) nil in the (construction)
refint[nil] in ML-4); and the upper cell receives as
(permanent) value a pointer to the lower cell. Subsequent
execution of the ALGOL 68 assignment x := 3 would place the
value 3 in the lower cell; therefore its ML-4 equivalent is
the (assignment) ptr of x ← 3. The static typechecking
rules for ALGOL 68 insure that any assignment attempting to
place a non-integer value in the lower cell is detected and
indicated to be invalid.

ALGOL 68
    int x
    (i.e. ref int x = loc int)

ML-4
    refint = [int ptr];
    refint x;
    x ← refint[nil]

Fig. 4.3-9. Semantics of the ALGOL 68 declaration int x.

There is one aspect of the ALGOL 68 type system which
is more lenient than the ML-4 system. Unlike PL/1, no type
errors can arise from this loosening. Consider the
assignment y := x, where both identifiers x and y have been
declared to be of mode ref int. This assignment specifies
the updating of the mode int cell pointed to by y. But the
right-hand side, which must then supply an integer value, is
of mode ref int; according to ML-4 rules the assignment is
to be rejected by the translator as invalid. However,
ALGOL 68 recognizes that the ref int value of x points to an
int value, so all that needs to be done to obtain the
required integer value is follow the pointer x. This
process is called dereferencing. In general, the procedure
for obtaining a value of a desired mode from a value of some
other mode is known as coercion or conversion. Thus, in the
ALGOL 68 type system, if the left-hand side of an assignment
is of mode ref M, then the assignment is valid provided the
right-hand side is of mode M or can be coerced to yield a
mode M value. In our case, the procedure which translates
from ALGOL 68 into ML-4 must recognize that dereferencing is
called for, mark the assignment y := x as legal and generate
ML-4 code which takes the coercion into account. Of the
three assignments in the example shown in fig. 4.3-10,
coercion takes place only in the second one (where y is
dereferenced). The y on the right-hand side here is
translated into the ML-4 (expression) ptr of y, yielding a
valid ML-4 (assignment).
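The dereferencing step can be pictured as a small translation routine. The Python sketch below handles only assignments between identifiers; the function name translate_assignment, the encoding of modes as "int" or ("ref", mode), and the use of "<-" for the ML-4 arrow are assumptions made for the illustration.

```python
def translate_assignment(lhs, rhs, modes):
    """Translate the ALGOL 68 assignment lhs := rhs into ML-4 text.
    modes maps each identifier to "int" or ("ref", mode).
    Raises TypeError when no sequence of dereferencings makes it valid."""
    lmode = modes[lhs]
    if not (isinstance(lmode, tuple) and lmode[0] == "ref"):
        raise TypeError("left-hand side must be of a reference mode")
    wanted = lmode[1]  # mode of the cell referred to by lhs
    expr, mode = rhs, modes[rhs]
    while mode != wanted:
        if not (isinstance(mode, tuple) and mode[0] == "ref"):
            raise TypeError("right-hand side cannot be coerced")
        # dereference once: follow the pointer, i.e. select its ptr component
        expr, mode = "ptr of " + expr, mode[1]
    return f"ptr of {lhs} <- {expr}"


modes = {"x": "int", "y": ("ref", "int"), "z": ("ref", "int")}
first = translate_assignment("y", "x", modes)   # no coercion needed
second = translate_assignment("z", "y", modes)  # y is dereferenced
```

With x of mode int and y, z of mode ref int, the assignment y := x becomes ptr of y <- x and z := y becomes ptr of z <- ptr of y, while x := y is rejected because x is not of a reference mode.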



ALGOL 68
    int x = 3;
    int y,z;
    y := x;
    z := y;
    y := 4

ML-4
    refint = [int ptr];
    int x;
    refint y,z;
    x ← 3;
    y ← refint[nil];
    z ← refint[nil];
    ptr of y ← x;
    ptr of z ← ptr of y;
    ptr of y ← 4

Fig. 4.3-10. An example of coercion in ALGOL 68.

Note that the mode of x is int, and the mode of y and z is
ref int.

The concept of structured values in ALGOL 68 is
essentially the same concept when taken by itself as in ML-1
and ML-2 (as well as PL/1 and QUEST). Sharing arises only
through the use of reference modes; assignment of structured
values is done by componentwise copying. Figure 4.3-11
gives an example. The mode of z is pair; the mode of x is
ref pair. The expression (5,6) in the declaration for z is
called a structure display and simply gives values for the
components of z.

ALGOL 68
    mode pair = struct(int a,b);
    pair z = (5,6);
    pair x;
    x := z

ML-4
    pair = [int a; int b];
    refpair = [pair ptr];
    pair z; refpair x;
    z ← pair[5;6];
    x ← refpair[pair[nil;nil]];
    a of ptr of x ← a of z;
    b of ptr of x ← b of z

Fig. 4.3-11. Structure assignment in ALGOL 68.



The selection of components from a structure in
ALGOL 68 is syntactically identical to ML-4. In fig.
4.3-11, the selection b of z, which refers to the
b-component cell of z, is of mode int. There is a major
complication concerning selection in ALGOL 68. We can
legally form the selection b of x, where x is of
reference-to-structure mode. The mode of the selection
b of x is ref int, not int, even though the b-component cell
for the structure pointed to by x in figure 4.3-11 is of
mode int. We say in this case that the pointer is
distributed over the components (in ALGOL 68 terminology, x
is "endowed with subnames"). Thus, for example, the
assignment b of x := a of z is legal; in the ALGOL 68
program of fig. 4.3-11 it would place the value 5 into the
b-component cell of the structure pointed to by x.

Unfortunately, the "obvious" translation into ML-4
fails. The ML-4 type refint, defined as [int ptr],
corresponds to the mode ref int, but in fig. 4.3-11 there is
no cell of this type to associate to the (destination) that
corresponds to the ALGOL 68 selection b of x. Thus, in
translating from ALGOL 68 into ML-4, such cells must be
added to the picture (these cells will hold pointers to the
individual components of the structure referred to by x).
The corrected translation mechanism is shown in fig. 4.3-12;
for each reference-to-structure identifier x we add to the
local structure a reserved identifier x$sub to hold the
subnames (distributed component pointers). By looking at
the local structure pictured in fig. 4.3-12, we see that
there are two ways to access component cells of the
structure pointed to by x: through x (with (destination)
b of ptr of x) as when updating the structure itself by
componentwise copying; or through x$sub (with (destination)
ptr of b of x$sub) as when explicitly selecting from x using
subnames. Note that our translation conforms to the
stipulations set by the ML-4 static typechecking system.

ALGOL 68
    mode pair = struct(int a,b);
    pair x;
    a of x := 3;
    b of x := a of x

ML-4
    pair = [int a; int b];
    refint = [int ptr];
    refpair = [pair ptr];
    subpair = [refint a; refint b];
    refpair x; subpair x$sub;
    x ← refpair[pair[nil;nil]];
    x$sub ← subpair[refint[share a of ptr of x];
                    refint[share b of ptr of x]];
    ptr of a of x$sub ← 3;
    ptr of b of x$sub ← ptr of a of x$sub

Fig. 4.3-12. [...] in ALGOL 68.
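The two access paths to a component cell can be modelled in Python by giving each cell an identity of its own. The class Cell and the variable names below are hypothetical, and the sketch assumes that share yields the very cell it names, so that x and x$sub end up referring to the same component cells.

```python
class Cell:
    """A cell that holds one value; None models an empty (nil) cell."""

    def __init__(self, value=None):
        self.value = value


# x <- refpair[pair[nil;nil]]
x = {"ptr": {"a": Cell(), "b": Cell()}}

# x$sub <- subpair[refint[share a of ptr of x]; refint[share b of ptr of x]]
x_sub = {
    "a": {"ptr": x["ptr"]["a"]},  # subname for the a-component
    "b": {"ptr": x["ptr"]["b"]},  # subname for the b-component
}

# a of x := 3         becomes  ptr of a of x$sub <- 3
x_sub["a"]["ptr"].value = 3
# b of x := a of x    becomes  ptr of b of x$sub <- ptr of a of x$sub
x_sub["b"]["ptr"].value = x_sub["a"]["ptr"].value
```

Both updates are made through x$sub, yet the values are found when the structure is reached through x, since b of ptr of x and ptr of b of x$sub denote one cell.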

We give a final ALGOL 68 example, illustrating a
recursive structured mode. The example is shown in figure
4.3-13. box is a structured mode, recursively defined, and
a and b are of mode ref box. Note that the mode of the
selection n of a is ref ref box. The only coercion in the
program occurs in the last assignment, where a is
dereferenced. A recursive mode definition such as
mode badbox = struct(int v, badbox n) would be illegal; the
"ref" inside the definition of the mode box is necessary
since there is no implicit nil in ALGOL 68 modes as there
is with ML-4.

Thus we see that even with a language as complex as
ALGOL 68, we can use ML-4 to make clear its approaches to
the semantics of data structures.




    box = [int v; refbox n];   refbox = [box ptr];
    subbox = [refint v; refrefbox n];
    refint = [int ptr];   refrefbox = [refbox ptr];
    refbox a,b;   subbox a$sub,b$sub;
    a <- refbox[box[nil;nil]];   b <- refbox[box[nil;nil]];
    a$sub <- subbox[refint[share v of ptr of a];
                    refrefbox[share n of ptr of a]];
    b$sub <- subbox[refint[share v of ptr of b];
                    refrefbox[share n of ptr of b]];
    ptr of v of a$sub <- 8;
    ptr of n of a$sub <- b;
    v of ptr of b <- v of ptr of a;
    n of ptr of b <- n of ptr of a

Fig. 4.3-13. Final ALGOL 68 example.



Completeness 

In this chapter, we defined the mini-language ML-4 and
used it to model data structuring facilities of the
languages ALGOL W, PL/1, and ALGOL 68. As in the last chapter,
we close with a few remarks on the completeness of our
coverage of the approaches to data structures found in these
three languages.

With ALGOL W, as with SNOBOL4 in the previous chapter,
we covered nearly all the data structuring facilities
thoroughly, with the exception of arrays. We comment on arrays
and some of their special issues in Chapter 5.

For PL/1 and ALGOL 68, our treatment is far from
complete. This is to be expected because of the sheer bulk and
complexity of these two languages. There are numerous
features dealing with data structures which we have not
described. Yet we claim that those features which we did
describe in PL/1 and ALGOL 68 constitute the "heart" of their
data structuring facilities; thus our description of these
features should make clear the underlying semantic approaches
to data structures in these languages as well.



Chapter 5
CONCLUSIONS AND EXTENSIONS

5.1. What We Have Done

There are a large number of programming languages which
work with data structures. Because of the variety of
approaches found in these languages, many subtle but important
semantic distinctions crop up. With most languages, the
semantics (including in particular the semantics for the
data structuring facilities) are described in
English. We consider such descriptive methods inadequate
for our goals, since in many cases they fail to make clear
some of the important semantic principles such as sharing.
As we have seen, a misunderstanding of the interaction
between notions such as assignment and sharing can lead the
programmer into erroneous conclusions about the effects of
programs.

We have therefore developed in this thesis a
methodology for describing the semantics of data structures in
programming languages. In order to precisely describe
mechanisms found in programming languages which handle data
structures, we made use of the base language model, which is
an interpretive model for formal semantics. The base
language model is essentially a mathematical formalism for
modeling the changing states of a computing system on which
various computations are performed. A mathematical
treatment of the base language model is found in the Appendix;
our approach emphasized the use of the base language as a
programming tool similar to many conventional assembler
languages. A major advantage of the base language model over
other formal semantic models is that it manipulates data
objects of a sufficiently general nature that we can make
direct use of its data representations in our work without
need for special encoding mechanisms.

The main portion of this thesis was concerned with the
presentation and use of a series of mini-languages. With
these mini-languages, we isolated the relevant conceptual
abstractions such as assignment, value, construction,
selection, sharing and typechecking. The mini-languages provided
a "high-level" descriptive vehicle which made it simpler and
more convenient to talk about semantic issues relating to
data structures.



The basic structure of our methodology was to first 
make clear the semantics of our mini-languages by specifying 
their translation into the base language. Once this was
done, we no longer needed to think in terms of the primi- 
tive operations of the base language. We were then able to 
describe the semantics of data structuring features in some 
programming language by simply using the appropriate mini- 
language to describe how the relevant mechanisms worked. 

In treating the data structuring semantics of several 
programming languages, we gave mini-language code into which 
constructs of these languages are translated. Determination 
of this mini-language code presents difficulties when the 
semantics of the source language is incompletely or ambigu- 
ously specified, reflecting the inadequacy of the descrip- 
tive methods in use. Of course, once we have obtained a 
consistent translation into the right mini-language, we have 
an unambiguous semantic specification of the relevant con- 
structs. 

Using the techniques we developed, we described the 
data structuring semantics of a number of representative 
programming languages. With the simpler languages, we were 
able to give a nearly complete treatment of the data struc- 
turing facilities. As to the more complex languages, we 
were able to cover most of the fundamental approaches to 
data structures without getting caught up in the intricacies
of features of relatively little semantic relevance to the 
issues we are concerned with. In the next section, we talk 
about some of the areas that were left uncovered. 

5.2. Future Work

There are a number of semantic areas that we have not
treated. In order to cover these areas, we would need to
develop new mini-languages with additional mechanisms. In
this section, we give brief mention to two such areas and
what kinds of new mechanisms are required to treat them.

The first uncovered area is unions. With the type
system of ML-4, every cell is constrained to hold values of
only one type. In many programming languages, this
restriction is weakened somewhat by defining union types. If type
t is the union of types t1 and t2, then a cell of type t can
hold values of type t1 as well as values of type t2. For
example, suppose we declare z to be of type t in some
language that admits union types, and suppose that the
expressions e1 and e2 yield values of types t1 and t2,
respectively. Then both the assignments z := e1 and z := e2 would
be legal. This capability is not within the reach of the
type mechanisms we developed for ML-4. Suppose we declare x
to be of type t1. Then the assignment x := z can be
executed without type error precisely when the value of z is of
type t1 rather than of type t2. In order to add to our
mini-languages a capability to handle unions, some kind of
additional runtime type testing mechanism must be
introduced into the design of the language.
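
A minimal sketch of such a runtime test, written in Python rather than in any mini-language of this thesis, may make the point concrete. The names TypedCell, CellTypeError, T1 and T2 are hypothetical, and Python's int and str merely stand in for the types t1 and t2.

```python
class CellTypeError(Exception):
    """Raised when a value does not fit the declared type of a cell."""

class TypedCell:
    """A cell declared with one type, or with a union of several types."""
    def __init__(self, *allowed):
        self.allowed = allowed      # tuple of admissible types
        self.value = None

    def assign(self, value):
        # The runtime test: the value's actual type must be admissible.
        if not isinstance(value, self.allowed):
            raise CellTypeError(type(value).__name__)
        self.value = value

T1, T2 = int, str                   # stand-ins for t1 and t2
z = TypedCell(T1, T2)               # z is of the union type t
x = TypedCell(T1)                   # x is of type t1

z.assign(3)                         # z := e1
x.assign(z.value)                   # x := z succeeds: value is of type t1

z.assign("three")                   # z := e2
try:
    x.assign(z.value)               # x := z now fails the runtime test
    outcome = "assigned"
except CellTypeError:
    outcome = "type error"
```

The legality of x := z cannot be decided from the declarations alone; it depends on the value z holds when the assignment is executed.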

The second uncovered area is arrays. The type system
of ML-4 is simply not equipped to deal with arrays whose
subscript bounds are flexible. The type of such an array
would contain structures having differing numbers of
components. A structured type in ML-4 requires a set of
selectors which is known to the translator and cannot change.
Even with unions, we are no better off. For instance, the
type consisting of all PAL tuples could not even be expressed
as a finite union of ML-4 types, since a tuple can have any
one of an infinite number of selector sets ({1}, {1,2},
{1,2,3}, ... , {1,2,...,n}, ...).

There are many other complicated issues concerning
arrays, such as different array type concepts,
changeability of bounds, and assignments between fixed and
flexible arrays. All of these issues introduce new complexity
into the language, requiring the development of more techniques.

To sum up, our methodology for describing data
structures has special advantages from each of its two portions.
The use of the base language model provides for a precise, 
formal characterization of the semantic rules of the lan- 
guages under study, while our mini-languages provide the 
convenience of high-level descriptions of the actions being 
modeled. In order to describe any programming language 
feature, all that needs to be done is construct an appro- 
priate mini-language which handles only the concepts direct- 
ly relating to that feature. The syntax and semantics of 
such a mini-language are naturally easy to work with and
understand. By specifying translations from source lan- 
guages into the mini-language and from the mini-language 
into the base language, we gain a precise but conceptually 
clear characterization of the semantics of the features 
we wish to study. 
Appendix
A MORE FORMAL TREATMENT OF BL

A.1. Interpreter States

An interpreter state embodies the information present
at a given time in the computer system we are modeling. In
this section we describe in detail the structure of
BL-graphs representing interpreter states in the base language
model. Our treatment here differs somewhat from [Denn 71]
and [Amer 72], but is essentially equivalent. In the next
section we formalize BL-graphs and their transformations.

We assume that the reader is familiar with the concept
of process as a locus of control. A process is represented
in an interpreter state by a BL-object which we call a site
of activity, or SOA. The BL-graph for an interpreter state
is essentially a collection of SOA's. The root nodes of
such a BL-graph are the root nodes of its SOA's. Thus an
interpreter state is represented by a BL-graph whose skeletal
form is shown in fig. A.1-1.



Fig. A.1-1. Skeletal structure of BL-graph for interpreter state

We now describe the structure of the individual SOA's of
an interpreter state. A SOA is a BL-object with four
components:

(1) The ep-component is a local structure, a BL-object
representing the environment in which the SOA's computation 
takes place. (The name "ep" is an abbreviation for environ- 
ment pointer.) Components of a local structure represent 
variables and temporaries used by the computation. Nearly 
all the BL instructions executed as part of the computation 
affect its local structure. We allow for the possibility of 
different SOA's sharing the same local structure, but usu- 
ally the local structures of the different SOA's are dis- 
tinct. 

One distinguished SOA has as its ep-component a
BL-object known as the universe. The universe represents the
system-resident information present in the computer when no 
computations are in progress. Generally speaking, this in- 
formation is independent of which computations are currently 
active or how far individual computations have progressed. 
This special SOA stands, so to speak, at the head of the 
system call chain, so that every process can trace its an- 
cestry back to it. Access to the data in the universe is 
passed from caller to callee, so whatever access a particular
SOA has to the universe is determined by the call chain
leading back to the one distinguished SOA. 

Two kinds of objects are found as components in the
universe: data structures and procedure structures. Each
kind of object can have objects of either kind as
components. A data structure in the model can be any arbitrary
BL-object; a procedure structure is a special kind of
BL-object representing a procedure expressed in the base
language. A BL instruction is easily represented as a
BL-object; for example, the instruction const 3,x is
depicted in figure A.1-2. The components with selectors
1,2,... of a procedure structure are simply representations of
its instructions in order. A procedure structure may also
have components which are procedure structures for nested
procedures. Figure A.1-3 illustrates a skeleton procedure
structure for a procedure p with one procedure f nested inside.

Fig. A.1-2. A sample BL instruction as a BL-object.

(2) The ip-component of a SOA gives the instruction
currently being executed by the SOA's computation, as well
as the procedure containing this instruction ("ip" stands
for instruction pointer). The ip-component is a
two-component structure, whose proc-component gives the current
procedure structure from which instructions are being
executed, and whose inst-component gives the number of the
instruction currently being executed in this procedure
(fig. A.1-4). Thus the instruction currently being executed
within a SOA s is given by the dotted pathname
ip.proc.(ip.inst), taken relative to the root node of s.
Fig. A.1-4. ip-component of a SOA

(3) The stat-component of a SOA, which gives its status,
is an elementary object with the value 1 when the SOA is
active (i.e. currently processing instructions), and 0 if the
SOA is dormant.

(4) The ret-component of a SOA s shares with the SOA
that invoked (created) s. When s executes a return
instruction, the SOA given by the ret-component of s is
activated; the current SOA is put to sleep.
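
The four components can be pictured with a small Python sketch, under the assumption that BL-objects are modeled as nested dictionaries keyed by selector. The names procedure, caller, soa and current_instruction are hypothetical. Selector 0 is used for the operation code, as in the definition of ExecuteBLInstruction later in this appendix.

```python
# A procedure structure: instructions under the selectors 1, 2, ...
# Each instruction is itself an object; selector 0 holds the operation code.
procedure = {
    1: {0: "const", 1: 3, 2: "x"},      # const 3,x
    2: {0: "return"},
}

# The invoking SOA, dormant while the SOA it created is running.
caller = {"ep": {}, "ip": {"proc": procedure, "inst": 1},
          "stat": 0, "ret": None}

soa = {
    "ep": {},                               # (1) local structure
    "ip": {"proc": procedure, "inst": 1},   # (2) instruction pointer
    "stat": 1,                              # (3) 1 = active, 0 = dormant
    "ret": caller,                          # (4) shares with the invoker
}

def current_instruction(s):
    """Follow the pathname ip.proc.(ip.inst) from the root node of SOA s."""
    ip = s["ip"]
    return ip["proc"][ip["inst"]]

instruction = current_instruction(soa)      # the object for const 3,x
```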
With the structure of an interpreter state given above,
we can proceed to the next section, which describes how the
BL instructions transform interpreter states.

We give a formal mathematical definition of BL-graphs.
Suppose the sets ELEM (elementary objects), SEL (selectors)
and NODES (nodes) are given. For our purposes, ELEM shall
consist of integers, truth values, real numbers and strings;
SEL shall consist of integers and strings; and NODES shall
be an arbitrary countably infinite set. Strings are taken
over some suitable alphabet which includes the alphanumeric
characters together with some special characters. A
BL-graph over these three sets is a 4-tuple g = (U,R,A,V)
in which

    U (nodes in use) is a finite subset of NODES;
    R (root nodes) ⊆ U;
    A (arcs) ⊆ U × SEL × U;
    V (valuations) ⊆ U × ELEM.

We interpret (α,σ,β) ∈ A to mean that there is a directed arc
with selector σ leading from node α to node β, and
(α,δ) ∈ V to mean α is a leaf node with elementary value
δ. A BL-graph g must satisfy the following four conditions:
(1) If α ∈ U, σ ∈ SEL, then there is at most one β ∈ U
    for which (α,σ,β) ∈ A.

(2) If α ∈ U, then there is at most one δ ∈ ELEM for
    which (α,δ) ∈ V.

(3) pr1(A) ∩ pr1(V) = ∅, where pr1 is the first-component
    projection mapping. Equivalently,

        ∀α ∈ U: ¬[∃δ ∈ ELEM: ((α,δ) ∈ V)
                  & ∃(σ,β) ∈ SEL × U: (α,σ,β) ∈ A].

(4) D*(R) = U, where D* is the reflexive transitive
    closure of the immediate-descendant mapping
    D: 2^U -> 2^U defined by

        D(S) = {β ∈ U: ∃α ∈ S, σ ∈ SEL s.t. (α,σ,β) ∈ A}.

Property (1) ensures unique selection, i.e. that the
selectors on the arcs emerging from a node are distinct.
erty (2) asserts that no node may have more than one elem- 
entary value. Property (3) says that no node may have both 
components and an elementary value, i.e. that elementary 
values can be attached only to leaf nodes. Property (4) 
states that every node of a BL-graph is accessible along 
some directed path of arcs starting with a root node. 
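
The four conditions translate directly into a check on finite sets. The following Python sketch is illustrative only: the function name is_bl_graph and the choice of Python sets of tuples for A and V are assumptions, not part of the base language model.

```python
def is_bl_graph(U, R, A, V):
    """Return True if (U, R, A, V) satisfies conditions (1)-(4).

    U, R: sets of nodes; A: set of (node, selector, node) arcs;
    V: set of (node, elementary value) pairs.
    """
    # The 4-tuple itself: R, A and V may mention only nodes in use.
    if not R <= U:
        return False
    if any(a not in U or b not in U for a, _, b in A):
        return False
    if any(a not in U for a, _ in V):
        return False

    # (1) Unique selection: at most one arc per (node, selector).
    selections = [(a, s) for a, s, _ in A]
    if len(selections) != len(set(selections)):
        return False

    # (2) At most one elementary value per node.
    valued = [a for a, _ in V]
    if len(valued) != len(set(valued)):
        return False

    # (3) No node has both components and an elementary value.
    if {a for a, _, _ in A} & set(valued):
        return False

    # (4) Every node is reachable from some root node.
    reached, frontier = set(R), list(R)
    while frontier:
        node = frontier.pop()
        for a, _, b in A:
            if a == node and b not in reached:
                reached.add(b)
                frontier.append(b)
    return reached == U

U = {"root", "leaf"}
R = {"root"}
A = {("root", "x", "leaf")}
V = {("leaf", 3)}
ok = is_bl_graph(U, R, A, V)
bad = is_bl_graph(U, R, A, V | {("root", 7)})   # violates condition (3)
```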

We now give a formalism for defining transformations on
BL-graphs. The formalism is based on [Denn 74]; it makes
use of a set ID of identifiers and a mapping
ν: ID ∪ ELEM ∪ NODES -> ELEM ∪ NODES which assigns values
to identifiers and acts as the identity function on
elementary values and nodes. A transformation maps a
BL-graph g = (U,R,A,V) into a new graph g' = (U',R',A',V')
and updates the valuation mapping ν into a new mapping ν'.
The notation ν[α/x] means λy.(y=x -> α, true -> ν(y)), i.e. a
mapping equivalent to ν except that it maps x into α.

The following basic transformations, predicates and
functions are defined for arbitrary BL-graphs:

AddElem(a,d): [defined provided α ∈ U, δ ∈ ELEM,
               where α = ν(a), δ = ν(d)]
    V' = V ∪ {(α,δ)}, U' = U, R' = R, A' = A, ν' = ν.

DeleteElem(a,d): [defined provided α ∈ U, δ ∈ ELEM and
                  (α,δ) ∈ V, where α = ν(a), δ = ν(d)]
    V' = V - {(α,δ)}, U' = U, R' = R, A' = A, ν' = ν.

AddArc(a,s,b): [defined provided α,β ∈ U, σ ∈ SEL,
                where α = ν(a), σ = ν(s), β = ν(b)]
    A' = A ∪ {(α,σ,β)}, U' = U, R' = R, V' = V, ν' = ν.

DeleteArc(a,s,b): [defined provided α,β ∈ U, σ ∈ SEL and
                   (α,σ,β) ∈ A, where α = ν(a), σ = ν(s),
                   β = ν(b)]
    A' = A - {(α,σ,β)}, U' = U, R' = R, V' = V, ν' = ν.

DeleteComps(a): [defined provided α ∈ U, where α = ν(a)]
    A' = A ∩ ((U - {α}) × SEL × U), U' = U, R' = R,
    V' = V, ν' = ν.

Prune:
    U' = D*(R), R' = R ∩ U', A' = A ∩ (U' × SEL × U'),
    V' = V ∩ (U' × ELEM), ν' = ν.

HasComp(a,s): [defined provided α ∈ U, σ ∈ SEL,
               where α = ν(a), σ = ν(s)]
    if ∃β ∈ U: (α,σ,β) ∈ A then true else false.

Comp(a,s) -> b: [defined provided α ∈ U, σ ∈ SEL and
                 HasComp(a,s) = true, i.e. ∃β ∈ U: (α,σ,β) ∈ A,
                 where α = ν(a), σ = ν(s)]
    let β ∈ U such that (α,σ,β) ∈ A;
    ν' = ν[β/b], U' = U, R' = R, A' = A, V' = V.

HasElem(a): [defined provided α ∈ U, where α = ν(a)]
    if ∃δ ∈ ELEM: (α,δ) ∈ V then true else false.

Elem(a) -> d: [defined provided α ∈ U and HasElem(a) = true,
               i.e. ∃δ ∈ ELEM: (α,δ) ∈ V, where α = ν(a)]
    let δ ∈ ELEM such that (α,δ) ∈ V;
    ν' = ν[δ/d], U' = U, R' = R, A' = A, V' = V.

NewNode -> a:
    let α ∈ NODES - U;
    ν' = ν[α/a], U' = U ∪ {α}, R' = R, A' = A, V' = V.

MakeRoot(a): [defined provided α ∈ U - R, where α = ν(a)]
    R' = R ∪ {α}, U' = U, A' = A, V' = V, ν' = ν.

RemoveRoot(a): [defined provided α ∈ R ⊆ U, where α = ν(a)]
    U' = U - {α}, R' = R - {α}, A' = A, V' = V, ν' = ν.

The following transformations are composites of basic
transformations:

NewComp(a,s) -> b:
    NewNode -> b;
    AddArc(a,s,b).

    [n.b. the semicolon indicates composition of
    transformations, with application in the order shown]

DeleteComp(a,s):
    if HasComp(a,s)
        then {Comp(a,s) -> b;
              DeleteArc(a,s,b);
              Prune}.

    [the composite transformation in the set braces is
    applied if the node denoted by a has a component
    with selector denoted by s]

MakeEmpty(a,s) -> b:
    if HasComp(a,s)
        then {Comp(a,s) -> b;
              if HasElem(b)
                  then {Elem(b) -> d;
                        DeleteElem(b,d)}
                  else {DeleteComps(b);
                        Prune}}
        else NewComp(a,s) -> b.

    [makes b denote an empty leaf node which is the
    s-component of the node denoted by a]
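
As an informal check on these definitions, the basic and composite transformations can be sketched in Python. This is a simplified model under stated assumptions, not the formalism itself: the class BLGraph and its method names are hypothetical, A is stored as a dictionary from (node, selector) to node, V as a dictionary from node to value, and the valuation mapping ν is replaced by ordinary Python variables and return values.

```python
class BLGraph:
    """Simplified BL-graph with a few basic and composite transformations."""
    def __init__(self):
        self.U, self.R = set(), set()
        self.A = {}          # (node, selector) -> node
        self.V = {}          # node -> elementary value
        self._fresh = 0      # counter supplying nodes not yet in use

    # Basic transformations.
    def new_node(self):
        node = self._fresh
        self._fresh += 1
        self.U.add(node)
        return node
    def make_root(self, a): self.R.add(a)
    def add_arc(self, a, s, b): self.A[(a, s)] = b
    def delete_arc(self, a, s): del self.A[(a, s)]
    def has_comp(self, a, s): return (a, s) in self.A
    def comp(self, a, s): return self.A[(a, s)]
    def has_elem(self, a): return a in self.V
    def add_elem(self, a, d): self.V[a] = d
    def delete_elem(self, a): del self.V[a]
    def delete_comps(self, a):
        for key in [k for k in self.A if k[0] == a]:
            del self.A[key]
    def prune(self):
        # Keep only the nodes reachable from the roots, i.e. D*(R).
        reached, stack = set(self.R), list(self.R)
        while stack:
            node = stack.pop()
            for (src, _), dst in self.A.items():
                if src == node and dst not in reached:
                    reached.add(dst)
                    stack.append(dst)
        self.U = reached
        self.A = {k: v for k, v in self.A.items()
                  if k[0] in reached and v in reached}
        self.V = {n: d for n, d in self.V.items() if n in reached}

    # Composite transformations.
    def new_comp(self, a, s):
        b = self.new_node()
        self.add_arc(a, s, b)
        return b
    def delete_comp(self, a, s):
        if self.has_comp(a, s):
            self.delete_arc(a, s)
            self.prune()
    def make_empty(self, a, s):
        if self.has_comp(a, s):
            b = self.comp(a, s)
            if self.has_elem(b):
                self.delete_elem(b)
            else:
                self.delete_comps(b)
                self.prune()
            return b
        return self.new_comp(a, s)

g = BLGraph()
cls = g.new_node()
g.make_root(cls)
first = g.make_empty(cls, "x")    # x does not exist yet: a new leaf node
g.add_elem(first, 3)
second = g.make_empty(cls, "x")   # x exists: the same node, emptied
g.add_elem(second, 4)
```

Note that make_empty reuses the existing node when the component is already present, so any other arc sharing that node sees the new value.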



We now have the machinery to describe the action of the BL
interpreter. The basic action is to pick a root node, which
will be some SOA, then to execute the next instruction
(given by the ip-component of the SOA) with respect to the
current local structure (given by the ep-component of this
SOA). Figure A.2-1 illustrates the skeletal structure of a
sample SOA. In the procedure we will give to define the
action of the interpreter, special names are used to
designate nodes in the current SOA. These names appear as
labels for the nodes in fig. A.2-1.






Fig. A.2-1. Structure of a SOA



Before giving a procedure which specifies the action of
the BL interpreter, we define several auxiliary
transformations. These use the special names shown in fig. A.2-1.

PickActiveRoot -> root:
    let α ∈ R such that ∃β ∈ U: (α,'stat',β) ∈ A & (β,1) ∈ V;
    ν' = ν[α/root], U' = U, R' = R, A' = A, V' = V.

Succ -> next:
    ν' = ν[κ+1/next], U' = U, R' = R, A' = A, V' = V,
    where κ = ν(k).

GetNextInstr:
    DeleteElem(inum,k);
    AddElem(inum,next).

Jump(i) -> next: [defined for ι ∈ {0,1,2,...} ⊆ ELEM,
                  where ι = ν(i)]
    ν' = ν[ι/next], U' = U, R' = R, A' = A, V' = V.

Empty(a): [defined for α ∈ U, where α = ν(a)]
    if HasElem(a)
        then false
        else if ∃σ ∈ SEL, β ∈ U: (α,σ,β) ∈ A
            then false
            else true.

The action of the BL interpreter is specified by the
repetitive application of the transformation given by the
following procedure:

    PickActiveRoot -> root;        /* pick an active root node */
    Comp(root,'ep') -> cls;        /* access the c.l.s. via ep */
    Comp(root,'ip') -> ip;
    Comp(ip,'proc') -> proced;     /* access procedure structure */
    Comp(ip,'inst') -> inum;       /* number of current instr. */
    Elem(inum) -> k;
    Comp(proced,k) -> inst;        /* fetch current instruction */
    Succ -> next;                  /* set for next instruction */
    ExecuteBLInstruction(inst);    /* execute the instruction */
    GetNextInstr.                  /* reset ip for new instr. */
Finally, we define the operation of all the BL
instructions by giving the transformation ExecuteBLInstruction.

ExecuteBLInstruction(inst):
    Comp(inst,0) -> operation;
    case operation of    /* choose the action that matches the
                            operation code of the instruction */
    'create':
        Comp(inst,1) -> x;                  /* create x */
        DeleteComp(cls,x);
        NewComp(cls,x) -> a.
    'clear':
        Comp(inst,1) -> x;                  /* clear x */
        MakeEmpty(cls,x) -> a.
    'delete':
        Comp(inst,1) -> x;                  /* delete x */
        if ¬HasComp(inst,2)                 /* delete x,m */
            then DeleteComp(cls,x)
            else {Comp(inst,2) -> m;
                  if HasComp(cls,x)
                      then {Comp(cls,x) -> a;
                            DeleteComp(a,m)}}.
    'const':
        Comp(inst,1) -> v;
        Comp(inst,2) -> x;                  /* const v,x */
        MakeEmpty(cls,x) -> a;
        AddElem(a,v).
    'add':
        Comp(inst,1) -> x;
        Comp(inst,2) -> y;
        Comp(inst,3) -> z;                  /* add x,y,z */
        Comp(cls,x) -> a; Comp(cls,y) -> b;
        Elem(a) -> d; Elem(b) -> e;
        MakeEmpty(cls,z) -> c;
        AddElem(c, ν(d)+ν(e)).

    /* other arithmetic instructions are similar */

    'link':
        Comp(inst,1) -> x;
        Comp(inst,2) -> n;
        Comp(inst,3) -> y;                  /* link x,n,y */
        Comp(cls,x) -> a; Comp(cls,y) -> b;
        if HasElem(a)
            then {Elem(a) -> d; DeleteElem(a,d)}
            else DeleteComp(a,n);
        AddArc(a,n,b).
    'select':
        Comp(inst,1) -> x;
        Comp(inst,2) -> n;
        Comp(inst,3) -> y;                  /* select x,n,y */
        Comp(cls,x) -> a;
        if ¬HasComp(a,n)
            then {if HasElem(a)
                      then {Elem(a) -> d;
                            DeleteElem(a,d)};
                  NewComp(a,n) -> b}
            else Comp(a,n) -> b.
    'apply':
        Comp(inst,1) -> p;
        Comp(inst,2) -> x;                  /* apply p,x */
        Comp(cls,p) -> proc; Comp(cls,x) -> arg;
        Comp(proc,'$text') -> t;
        NewNode -> newsoa;
        NewComp(newsoa,'ep') -> newcls;
        AddArc(newcls,'$par',arg);
        NewComp(newsoa,'ip') -> newip;
        AddArc(newip,'proc',t);
        NewComp(newip,'inst') -> newinum;
        AddElem(newinum,1);
        NewComp(newsoa,'stat') -> newstat;
        AddElem(newstat,1);
        AddArc(newsoa,'ret',root);
        MakeRoot(newsoa);
        Comp(root,'stat') -> stat;
        DeleteElem(stat,1); AddElem(stat,0).
    'return':
        Comp(root,'ret') -> oldsoa;
        Comp(oldsoa,'stat') -> oldstat;
        DeleteElem(oldstat,0); AddElem(oldstat,1);
        RemoveRoot(root); Prune.
    'move':
        Comp(inst,1) -> f;
        Comp(inst,2) -> x;                  /* move f,x */
        Comp(proced,f) -> a;
        DeleteComp(cls,x); AddArc(cls,x,a).
    'goto':
        Comp(inst,1) -> ℓ;                  /* goto ℓ */
        Jump(ℓ) -> next.
    'elem?':
        Comp(inst,1) -> x;
        Comp(inst,2) -> ℓ;                  /* elem? x,ℓ */
        Comp(cls,x) -> a;
        if ¬HasElem(a)
            then Jump(ℓ) -> next.
    'empty?':
        Comp(inst,1) -> x;
        Comp(inst,2) -> ℓ;                  /* empty? x,ℓ */
        Comp(cls,x) -> a;
        if ¬Empty(a)
            then Jump(ℓ) -> next.
    'nonempty?':
        Comp(inst,1) -> x;
        Comp(inst,2) -> ℓ;                  /* nonempty? x,ℓ */
        Comp(cls,x) -> a;
        if Empty(a)
            then Jump(ℓ) -> next.
    'eq?':
        Comp(inst,1) -> x;
        Comp(inst,2) -> y;
        Comp(inst,3) -> ℓ;                  /* eq? x,y,ℓ */
        Elem(x) -> d; Elem(y) -> e;
        if ν(d) ≠ ν(e)
            then Jump(ℓ) -> next.
    'has?':
        Comp(inst,1) -> x;
        Comp(inst,2) -> m;
        Comp(inst,3) -> ℓ;                  /* has? x,m,ℓ */
        if ¬HasComp(x,m)
            then Jump(ℓ) -> next.
    'same?':
        Comp(inst,1) -> x;
        Comp(inst,2) -> y;
        Comp(inst,3) -> ℓ;                  /* same? x,y,ℓ */
        if ν(x) ≠ ν(y)
            then Jump(ℓ) -> next.

    /* other comparison instructions are similar */

    'getc':
        Comp(inst,1) -> x;
        Comp(inst,2) -> i;
        Comp(inst,3) -> ℓ;                  /* getc x,i,ℓ */
        Comp(cls,x) -> a; MakeEmpty(cls,i) -> b;
        if HasUnmarkedComps(a)
            then {GetUnmarkedComp(a) -> s;
                  Mark(a,s);
                  AddElem(b,s)}
            else {UnmarkCompsOf(a);
                  Jump(ℓ) -> next}.

    endcase

This completes the definition of the transformation
ExecuteBLInstruction. The getc instruction, however,
requires some special additional mechanisms, which we now
show.
HasUnmarkedComps(a): [defined provided α ∈ U, where α = ν(a)]
    if ∃σ ∈ SEL: (α,σ,β) ∈ A for some β ∈ U
                 and σ ∉ MARKSET(α)
        then true else false.

GetUnmarkedComp(a) -> s: [defined provided α ∈ U and
                          HasUnmarkedComps(a) = true, where
                          α = ν(a)]
    let σ ∈ SEL be as in the HasUnmarkedComps predicate;
    ν' = ν[σ/s].

Mark(a,s): [defined provided α ∈ U and σ ∈ SEL, where
            α = ν(a), σ = ν(s)]
    MARKSET(α) <- MARKSET(α) ∪ {σ}.

UnmarkCompsOf(a): [defined provided α ∈ U, where α = ν(a)]
    MARKSET(α) <- ∅.

We observe that each node α ∈ U has a set MARKSET(α)
associated with it. All such marksets are initially empty.
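
The marking mechanism lets repeated getc instructions step through the selectors of a node one at a time. A minimal Python sketch of one such step follows; the function getc_step and the dictionary representations of the arcs and of MARKSET are assumptions made for illustration. Returning None stands in for the jump taken when no unmarked component remains.

```python
def getc_step(arcs, markset, node):
    """One getc step on `node`.

    arcs: dict mapping (node, selector) -> node.
    markset: dict mapping node -> set of marked selectors.
    Returns an unmarked selector of `node` and marks it; when every
    selector is already marked, clears the marks and returns None.
    """
    marked = markset.get(node, set())
    unmarked = [s for (a, s) in arcs if a == node and s not in marked]
    if unmarked:
        s = unmarked[0]                          # any unmarked selector will do
        markset.setdefault(node, set()).add(s)   # Mark(a,s)
        return s
    markset[node] = set()                        # UnmarkCompsOf(a)
    return None

arcs = {("box", "v"): "cell1", ("box", "n"): "cell2"}
markset = {}

seen = []
while (selector := getc_step(arcs, markset, "box")) is not None:
    seen.append(selector)
# seen now holds each selector of "box" exactly once
```

Since the marks are cleared when the selectors are exhausted, the same node can be traversed again by a later loop.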

There is one final remark to be made. Although our 
definitions of the BL instructions contain many composite 
transformations, the interpreter is to regard the effect of
a BL instruction as an indivisible unit. 



